Lecture notes for Introduction to SPDE, Spring 2016


Lenya Ryzhik

May 6, 2016

Nothing found here is original except for a few mistakes and misprints here and there. These notes are simply a record of what I cover in class, to spare the students the necessity of taking the lecture notes. The readers should consult the original books for a better presentation and context. We plan to follow the lecture notes by Davar Khoshnevisan and the book by John Walsh.

1 The white noise

We would like to be able to make sense of the solutions of equations with rough forces, such as the heat equation
$$\frac{\partial u}{\partial t} = \Delta u + F(t,x), \qquad (1.1)$$
or the wave equation
$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \Delta u + F(t,x), \qquad (1.2)$$
with a highly irregular function $F(t,x)$, as well as nonlinear versions of these equations. However, if $F$ is very rough then, presumably, the solution $u(t,x)$ will not be very smooth either, hence one would not be able to differentiate it in time or space, and the sense in which $u(t,x)$ solves the corresponding equation is not quite clear. It is natural to think of $u(t,x)$ as a weak solution of the PDE, as that would not require differentiation, but if the PDE (unlike the examples above) is nonlinear we would still need to know that $u(t,x)$ is a function, which is not a priori obvious if the force $F(t,x)$ is too irregular. As we will see, it is often the case that $u(t,x)$ is actually not a function.

Another obvious issue is to understand what we would mean by a rough force. If, say, (1.1) were posed on the lattice $\mathbb{Z}^d$, and $\Delta$ were a discrete Laplacian, then (1.1) would be an infinite system of SDE's, one at each lattice site, with $F(t,x)$ independent for each site $x\in\mathbb{Z}^d$. This makes clear sense, as long as $x$ is discrete. In order to define this in $\mathbb{R}^d$, we need to develop some basics. A natural way to generalize independence at each site is to require that $F(t,x)$ is a stationary in time and space mean-zero process such that the two-point correlation function is
$$\mathbb{E}[F(t,x)F(t',x')] = \begin{cases} 1 & \text{if } t=t' \text{ and } x=x',\\ 0 & \text{otherwise.}\end{cases} \qquad (1.3)$$
Department of Mathematics, Stanford University, Stanford, CA 94305, USA; ryzhik@math.stanford.edu

Such a random field, however, does not exist, and the next best choice would be a random field with the correlation function
$$\mathbb{E}[F(t,x)F(t',x')] = \delta(t-t')\delta(x-x'). \qquad (1.4)$$
The first order of business would be to make this rigorous.

When is the solution a function?

Disregarding the question of a careful definition of such a random process, let us see what we can expect about the solutions of, say, the heat equation (1.1) with such a force. The Duhamel formula says that if $u(0,x)=0$, then the solution of (1.1) in $\mathbb{R}^d$ is
$$u(t,x) = \int_0^t\int_{\mathbb{R}^d} G(t-s,x-y)F(s,y)\,dy\,ds. \qquad (1.5)$$
Here,
$$G(t,x) = \frac{1}{(4\pi t)^{d/2}}\, e^{-|x|^2/(4t)} \qquad (1.6)$$
is the standard heat kernel. The function $u(t,x)$ is a stationary field in $x$, hence we can not possibly expect any decay in $x$, but we may still ask how large it should be. Let us compute its point-wise second moment:
$$\mathbb{E}\big[|u(t,x)|^2\big] = \int_0^t\int_0^t\int_{\mathbb{R}^{2d}} G(t-s,x-y)G(t-s',x-y')\,\mathbb{E}[F(s,y)F(s',y')]\,dy\,dy'\,ds\,ds'$$
$$= \int_0^t\int_{\mathbb{R}^d} G(t-s,x-y)^2\,dy\,ds = \int_0^t\int_{\mathbb{R}^d} G(s,y)^2\,dy\,ds = \frac{1}{(4\pi)^d}\int_0^t\int_{\mathbb{R}^d} \frac{1}{s^d}\, e^{-|y|^2/(2s)}\,dy\,ds = C_d\int_0^t \frac{ds}{s^{d/2}}. \qquad (1.7)$$
We see that $u(t,x)$ has a finite point-wise second moment if and only if $d=1$: therefore, it is only in one dimension that we may expect the solution of a typical SPDE to be a function.

Is the solution a distribution?

Since $u(t,x)$ does not seem to be a function in $d>1$, let us see if the field given by (1.5) at least makes sense as a distribution in $x$, point-wise in time. We multiply (1.5) by a test function $\varphi\in C_c^\infty(\mathbb{R}^d)$:
$$\Phi(t) = \langle u(t),\varphi\rangle = \int_0^t\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} G(t-s,x-y)F(s,y)\varphi(x)\,ds\,dy\,dx,$$
and compute the second moment of $\Phi(t)$:
$$\mathbb{E}\big[|\Phi(t)|^2\big] = \int_0^t\int_0^t\int_{\mathbb{R}^{4d}} G(t-s,x-y)G(t-s',x'-y')\,\mathbb{E}[F(s,y)F(s',y')]\,\varphi(x)\varphi(x')\,ds\,ds'\,dy\,dy'\,dx\,dx' \qquad (1.8)$$
$$= \int_0^t\int_{\mathbb{R}^{3d}} G(t-s,x-y)G(t-s,x'-y)\varphi(x)\varphi(x')\,ds\,dy\,dx\,dx'$$

$$= \int_0^t\int_{\mathbb{R}^d}\Big(\int_{\mathbb{R}^d} G(t-s,y-x)\varphi(x)\,dx\Big)\Big(\int_{\mathbb{R}^d} G(t-s,y-x')\varphi(x')\,dx'\Big)\,dy\,ds$$
$$= \int_0^t\int_{\mathbb{R}^d} v^2(t-s,y)\,dy\,ds = \int_0^t\int_{\mathbb{R}^d} v^2(s,y)\,dy\,ds.$$
Here, the function $v(s,y)$ is the solution of the heat equation
$$v_t = \Delta v, \qquad (1.9)$$
with the initial condition $v(0,y)=\varphi(y)$. As $\varphi\in C_c^\infty(\mathbb{R}^d)$, the function $v(t,x)$ is most beautifully smooth and rapidly decaying. It follows that $\Phi(t)$ has a finite second moment, meaning that it is likely one can make sense of (1.5) as a distribution in any dimension. Thus, in dimensions $d\ge 2$ one, generally, would expect solutions of SPDEs to be distributions and not functions. This is a problem, as we will often be interested in solutions of nonlinear SPDEs, and we do not know how to take nonlinear functions of distributions.

In particular, SPDEs often arise as limit descriptions of the densities of systems of $N$ particles. It is typical that in such models the particle density converges to the solution of a deterministic PDE as $N\to+\infty$, such as, in the simplest case, the heat equation:
$$\frac{\partial u}{\partial t} = \Delta u. \qquad (1.10)$$
Accounting for a large but finite number of particles often leads to an SPDE that is a perturbation of the limiting deterministic problem, such as
$$\frac{\partial u}{\partial t} = \Delta u + \text{Noise}. \qquad (1.11)$$
In that setting, the noise term typically has variance proportional, locally, to the total number of particles, that is, to $u(t,x)$ (this is a version of the central limit theorem). Thus, the equation would have the form
$$\frac{\partial u}{\partial t} = \Delta u + \sqrt{u}\,\text{Noise}. \qquad (1.12)$$
Hence, we would need to deal not only with nonlinearities but also with non-Lipschitz nonlinearities.

The randomly forced wave equation

Let us, as a next example, informally consider the wave equation
$$\frac{\partial^2 v}{\partial t^2} - \frac{\partial^2 v}{\partial x^2} = F(t,x), \qquad (1.13)$$
with the initial condition
$$v(0,x) = \frac{\partial v(0,x)}{\partial t} = 0.$$

Its solution is
$$v(t,x) = \frac12\int_0^t\int_{x-(t-s)}^{x+(t-s)} F(s,y)\,dy\,ds. \qquad (1.14)$$
Let us now assume that $F(s,y)$ is a white noise, as in the heat equation example. Then the solution $v(t,x)$ is the value of the white noise, as a distribution, on the characteristic function of the triangle which is the domain of integration in (1.14): therefore, we arrive at the same need to understand the white noise $F(t,x)$ as a distribution. In higher dimensions, similar expressions for the solution of the forced wave equation can be obtained via spherical means.

Gaussian processes

We now start being more careful than in the above informal discussion. A stochastic process $G(t)$, $t\in T$, indexed by a set $T$ is a Gaussian random field if for every finite collection $t_1,\dots,t_k\in T$, the vector $(G(t_1),\dots,G(t_k))$ is a Gaussian random vector. The Kolmogorov consistency theorem implies that the finite-dimensional distributions of $G$ are uniquely determined by the mean $\mu(t) = \mathbb{E}(G(t))$ and the covariance
$$C(s,t) = \mathbb{E}[(G(s)-\mu(s))(G(t)-\mu(t))].$$
The covariance function is non-negative definite in the following sense: for any $t_1,\dots,t_k$ and $z_1,\dots,z_k\in\mathbb{C}$, we have
$$\sum_{j,m=1}^k C(t_j,t_m)\, z_j\bar z_m \ge 0. \qquad (1.15)$$
This is because
$$\mathbb{E}\Big|\sum_{j=1}^k (G(t_j)-\mu(t_j))z_j\Big|^2 = \sum_{j,m=1}^k C(t_j,t_m)z_j\bar z_m.$$
A classical result of Kolmogorov is that given any function $\mu(t)$ and a nonnegative-definite function $C(s,t)$ one can construct a Gaussian random field $G(t)$ with mean $\mu(t)$ and covariance $C(s,t)$.

Example 1: the Brownian motion. One of the most basic examples of a Gaussian process is the Brownian motion. In that case, $T=[0,+\infty)$, $\mu(t)\equiv 0$, and $C(s,t)=\min(s,t)$. Let us check that this covariance is nonnegative-definite:
$$\sum_{j,m=1}^k z_j\bar z_m \min(t_j,t_m) = \sum_{j,m=1}^k z_j\bar z_m\int_0^{+\infty} \chi_{[0,t_j]}(s)\chi_{[0,t_m]}(s)\,ds = \int_0^{+\infty}\Big|\sum_{j=1}^k z_j\chi_{[0,t_j]}(t)\Big|^2 dt \ge 0.$$
Of course, the Brownian motion can be constructed in many other ways, without invoking general abstract theorems.

Example 2: the Brownian bridge.
The Brownian bridge $b(t)$ is a process on the interval $T=[0,1]$, with mean zero, $\mu(t)\equiv 0$, and the covariance
$$C(s,t) = \mathbb{E}(b(s)b(t)) = s\wedge t - st. \qquad (1.16)$$
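The covariance (1.16) is easy to check by simulation. Below is a minimal sketch (not from the notes) that samples $(B(s),B(t),B(1))$ from independent Gaussian increments and uses the standard representation $b(t)=B(t)-tB(1)$ of the bridge in terms of a Brownian motion $B$:

```python
import numpy as np

def bridge_cov_mc(s, t, n=1_000_000, seed=0):
    """Monte Carlo estimate of E[b(s)b(t)] for the Brownian bridge,
    using b(t) = B(t) - t B(1); the exact answer is min(s,t) - s*t."""
    assert 0.0 <= s <= t <= 1.0
    rng = np.random.default_rng(seed)
    # Build (B(s), B(t), B(1)) from independent Gaussian increments.
    z1, z2, z3 = rng.normal(size=(3, n))
    B_s = np.sqrt(s) * z1
    B_t = B_s + np.sqrt(t - s) * z2
    B_1 = B_t + np.sqrt(1.0 - t) * z3
    return np.mean((B_s - s * B_1) * (B_t - t * B_1))

est = bridge_cov_mc(0.3, 0.7)   # exact value: min(0.3, 0.7) - 0.3*0.7 = 0.09
```

With $10^6$ samples the standard error of the estimator is below $10^{-3}$, so agreement with (1.16) to two decimals is expected.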

A remarkable property of the Brownian bridge is that
$$\mathbb{E}(b(0)^2) = \mathbb{E}(b(1)^2) = 0. \qquad (1.17)$$
That is, the process $b(s)$ starts at $b(0)=0$ and ends at $b(1)=0$, almost surely. Instead of verifying directly that the function $C(s,t)$ in (1.16) is nonnegative-definite, we observe that if $B(t)$ is a Brownian motion, then
$$b(t) = B(t) - tB(1) \qquad (1.18)$$
is a Brownian bridge. Indeed, it clearly has mean zero, and for any $0\le s\le t\le 1$ we have
$$\mathbb{E}(b(s)b(t)) = \mathbb{E}[(B(s)-sB(1))(B(t)-tB(1))] = s\wedge t - st - ts + st = s\wedge t - st.$$

Example 3: the Ornstein-Uhlenbeck process. The Brownian motion does not have a finite invariant measure: it typically runs away to infinity. In order to confine it, let us define the process
$$X(t) = e^{-t/2} B(e^t), \qquad (1.19)$$
for $t\ge 0$. The process $X(t)$ is mean-zero, and has the covariance, for $t\ge s$:
$$C(s,t) = e^{-(s+t)/2}\min\big(e^s,e^t\big) = e^{-(t-s)/2}. \qquad (1.20)$$
In other words, we have, for all $t,s\ge 0$:
$$C(s,t) = e^{-|t-s|/2}. \qquad (1.21)$$
In particular, $C(s,t)$ depends only on $t-s$: such processes are called stationary Gaussian processes. We also have
$$\mathbb{E}(X^2(t)) = 1, \qquad (1.22)$$
for all $t\ge 0$, which indicates that $X(t)$ is, indeed, confined in some sense.

Example 4: a general white noise. In general, we define a white noise as follows. Let $E$ be a set endowed with a measure $\nu$ and a collection $\mathcal{M}$ of measurable sets. Then a white noise is a random function on the sets $A\in\mathcal{M}$ of a finite $\nu$-measure such that $\dot W(A)$ is a mean-zero Gaussian random variable with $\mathbb{E}(\dot W(A)^2)=\nu(A)$, and if $A\cap B=\emptyset$, then $\dot W(A)$ and $\dot W(B)$ are independent, with
$$\dot W(A\cup B) = \dot W(A) + \dot W(B).$$
Then the covariance of $\dot W$ is
$$\mathbb{E}(\dot W(A)\dot W(B)) = \mathbb{E}\big[(\dot W(A\cap B)+\dot W(A\setminus B))(\dot W(A\cap B)+\dot W(B\setminus A))\big] = \mathbb{E}\big[\dot W(A\cap B)^2\big] = \nu(A\cap B).$$

This function is nonnegative-definite because
$$\sum_{i,j=1}^k z_i\bar z_j\, C(A_i,A_j) = \sum_{i,j=1}^k z_i\bar z_j\, \nu(A_i\cap A_j) = \sum_{i,j=1}^k z_i\bar z_j\int_E \chi_{A_i}(x)\chi_{A_j}(x)\,d\nu(x) = \int_E\Big|\sum_{j=1}^k z_j\chi_{A_j}(x)\Big|^2 d\nu(x) \ge 0.$$

Example 5: a Brownian sheet. The Brownian motion $W(t)$ can now be alternatively defined as follows. Take $E=\mathbb{R}_+$, and $\nu$ the Lebesgue measure on $\mathbb{R}_+$, and set $W(t)=\dot W([0,t])$. This gives $\mathbb{E}(W(t))=0$, and
$$\mathbb{E}(W(t)W(s)) = |[0,t]\cap[0,s]| = s\wedge t.$$
A Brownian sheet $W(t)$, with $t\in\mathbb{R}^n_+$, can be defined as above, taking $E=\mathbb{R}^n_+$, and $\nu$ the Lebesgue measure on $\mathbb{R}^n$. For $t=(t_1,\dots,t_n)\in\mathbb{R}^n_+$, we set
$$[0,t] = [0,t_1]\times\dots\times[0,t_n],$$
and define the Brownian sheet as
$$W(t) = \dot W([0,t]).$$
Exercise. Consider $n=2$ and denote $t=(t_1,t_2)$. (i) Show that if $t_1$ is fixed, then $W_{t_1,t}$ is a Brownian motion in $t$. (ii) Show that on the hyperbola $t_1t_2=1$ the process $X_t = W_{e^{-t},e^{t}}$ is an Ornstein-Uhlenbeck process. (iii) Show that on the diagonal the process $W_{t,t}$ is a martingale and has independent increments but is not a Brownian motion.

The white noise does not have a bounded total variation

Let us note that $\dot W$ does not have a bounded total variation, almost surely. The proof is very much like what we do in the proof of the Ito formula. To see this, we first show that
$$\lim_{n\to\infty}\sum_{j=0}^{2^n-1} \dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)^2 = 1, \qquad (1.23)$$
almost surely. Indeed, let us set
$$S_n = \sum_{j=0}^{2^n-1} \dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)^2,$$
then
$$\mathbb{E} S_n = \sum_{j=0}^{2^n-1} \Big|\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big| = 1,$$

and
$$\mathbb{E}(S_n-1)^2 = \mathbb{E}\Big(\sum_{j=0}^{2^n-1}\Big[\dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)^2 - \Big|\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big|\Big]\Big)^2$$
$$= \sum_{j,m=0}^{2^n-1}\mathbb{E}\Big\{\Big[\dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)^2 - \frac{1}{2^n}\Big]\Big[\dot W\Big(\Big[\frac{m}{2^n},\frac{m+1}{2^n}\Big]\Big)^2 - \frac{1}{2^n}\Big]\Big\}$$
$$= \sum_{j=0}^{2^n-1}\mathbb{E}\Big\{\Big[\dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)^2 - \frac{1}{2^n}\Big]^2\Big\} = 2^n\,\mathbb{E}\Big\{\Big[\dot W\Big(\Big[0,\frac{1}{2^n}\Big]\Big)^2 - \frac{1}{2^n}\Big]^2\Big\}$$
$$= 2^n\Big(\frac{3}{2^{2n}} - \frac{2}{2^{2n}} + \frac{1}{2^{2n}}\Big) = \frac{1}{2^{n-1}}.$$
We used the stationarity of the white noise in the next to last step above, and the fact that for a mean-zero Gaussian random variable $X$ we have
$$\mathbb{E}(X^4) = 3(\mathbb{E}(X^2))^2$$
in the last step. It follows that
$$\mathbb{P}(|S_n-1|>\varepsilon) \le \frac{1}{2^{n-1}\varepsilon^2}.$$
The Borel-Cantelli lemma implies that for every $\varepsilon>0$ the event $\{|S_n-1|>\varepsilon\}$ occurs only for finitely many $n$, almost surely. Thus, $S_n\to 1$ almost surely, and (1.23) holds.

Exercise. Use (1.23) to obtain
$$\lim_{n\to\infty}\sum_{j=0}^{2^n-1}\Big|\dot W\Big(\Big[\frac{j}{2^n},\frac{j+1}{2^n}\Big]\Big)\Big| = +\infty, \qquad (1.24)$$
also almost surely. It follows that $\dot W$ does not have a bounded total variation, almost surely.

2 Regularity of random processes

We will now prove the Kolmogorov theorem that reduces the question of continuity of a stochastic process to a computation of some moments. Recall that a process $X'(t)$, $t\in T$, is a modification of a process $X(t)$, $t\in T$, if $\mathbb{P}[X'(t)=X(t)]=1$ for all $t\in T$.

Exercise. Construct an example where $X'$ is a modification of $X$ but
$$\mathbb{P}[X'(t)=X(t) \text{ for all } t\in T] = 0.$$
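Before moving on, the concentration (1.23) of the white-noise quadratic variation is easy to observe numerically. A minimal sketch (not from the notes), assuming only the definition above: the mass of each dyadic cell is an independent mean-zero Gaussian with variance equal to the cell length:

```python
import numpy as np

def S_n(n, rng):
    """Sum of squared white-noise masses over the 2^n dyadic cells of [0, 1].
    Each mass W([j/2^n, (j+1)/2^n]) is an independent N(0, 2^-n) variable."""
    masses = rng.normal(0.0, np.sqrt(2.0 ** -n), size=2 ** n)
    return np.sum(masses ** 2)

rng = np.random.default_rng(0)
# E S_n = 1 and E(S_n - 1)^2 = 1/2^{n-1}, so S_n concentrates at 1:
samples = [S_n(18, rng) for _ in range(20)]
```

For $n=18$ the standard deviation of $S_n$ is $2^{-17/2}\approx 0.003$, so every sample should sit within a few hundredths of $1$; replacing the squares by absolute values in the same code illustrates the divergence (1.24).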

Modulus of continuity from an integral inequality

We first prove a real-analytic result that allows one to translate an integral bound on a function into a point-wise modulus of continuity. We need some notation. In the theorem below, the functions $\psi(x)$ and $p(x)$, $x\in\mathbb{R}$, are even, $p(x)$ is increasing for $x>0$, with $p(0)=0$, and $\psi(x)$ is convex and increasing for $x>0$. We denote by $Q_1$ the unit cube in $\mathbb{R}^d$.

Theorem 2.1 Let $f$ be a measurable function on $Q_1\subset\mathbb{R}^d$ such that
$$B := \int_{Q_1}\int_{Q_1}\psi\Big(\frac{|f(y)-f(x)|}{p(|y-x|/\sqrt d)}\Big)\,dx\,dy < +\infty, \qquad (2.1)$$
then there is a set $K$ of measure zero such that if $x,y\in Q_1\setminus K$, then
$$|f(y)-f(x)| \le 8\int_0^{|y-x|}\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\,dp(u). \qquad (2.2)$$
If $f$ is continuous, then (2.2) holds for all $x$ and $y$.

Proof. We denote the side length of a cube $Q$ in $Q_1$ by $e(Q)$. Note that
$$\text{if } x,y\in Q, \text{ then } |y-x|\le \sqrt d\, e(Q). \qquad (2.3)$$
The functions $p$ and $\psi$ are increasing for positive arguments, hence (2.1) and (2.3) imply
$$\int_Q\int_Q \psi\Big(\frac{|f(y)-f(x)|}{p(e(Q))}\Big)\,dx\,dy \le B, \qquad (2.4)$$
for any cube $Q$ in $Q_1$. Next, take a nested sequence of cubes $Q_0\supseteq Q_1\supseteq Q_2\supseteq\dots$ such that
$$p(e(Q_j)) = \frac12\, p(e(Q_{j-1})), \qquad (2.5)$$
and denote
$$f_j = \frac{1}{|Q_j|}\int_{Q_j} f\,dx, \qquad r_j = e(Q_j).$$
It follows from (2.5) that $r_j\downarrow 0$, and the cubes converge down to a point. As the function $\psi$ is convex, Jensen's inequality applied twice gives
$$\psi\Big(\frac{|f_j-f_{j-1}|}{p(r_{j-1})}\Big) \le \frac{1}{|Q_j|}\int_{Q_j}\psi\Big(\frac{|f(x)-f_{j-1}|}{p(r_{j-1})}\Big)\,dx \le \frac{1}{|Q_{j-1}||Q_j|}\int_{Q_{j-1}}\int_{Q_j}\psi\Big(\frac{|f(y)-f(x)|}{p(r_{j-1})}\Big)\,dx\,dy \le \frac{B}{|Q_{j-1}||Q_j|},$$
by (2.4). We conclude that
$$|f_j-f_{j-1}| \le p(r_{j-1})\,\psi^{-1}\Big(\frac{B}{|Q_{j-1}||Q_j|}\Big). \qquad (2.6)$$

This starts to look like a modulus of continuity estimate, but the two terms in the right side still compete: one is small, the other is large. We now re-write it to make it look like the right side of (2.2). The definition (2.5) of $Q_j$ means that
$$p(r_{j-1}) = 4\big(p(r_j)-p(r_{j+1})\big),$$
hence we may write
$$|f_j-f_{j-1}| \le 4\,\psi^{-1}\Big(\frac{B}{|Q_{j-1}||Q_j|}\Big)\big(p(r_j)-p(r_{j+1})\big). \qquad (2.7)$$
Next, note that for $r_{j+1}\le u\le r_j$ we have $|Q_{j-1}||Q_j|\ge u^{2d}$, hence
$$|f_j-f_{j-1}| \le 4\,\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\big(p(r_j)-p(r_{j+1})\big), \quad\text{for all } r_{j+1}\le u\le r_j. \qquad (2.8)$$
We deduce that
$$|f_j-f_{j-1}| \le 4\int_{r_{j+1}}^{r_j}\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\,dp(u). \qquad (2.9)$$
Summing over $j$ gives
$$\limsup_{j\to+\infty}|f_j-f_0| \le 4\int_0^{r_0}\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\,dp(u). \qquad (2.10)$$
Now, by the Lebesgue differentiation theorem, except for $x$ in an exceptional set $K$ of measure zero, the sequence $f_j$ converges to $f(x)$ for any sequence of cubes $Q_j$ decreasing to the point $x$. If $x$ and $y$ are not in $K$, and $Q_0$ is the smallest cube containing both $x$ and $y$, then, as $r_0\le|x-y|$, we have both
$$|f(x)-f_0| \le 4\int_0^{|x-y|}\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\,dp(u) \qquad (2.11)$$
and
$$|f(y)-f_0| \le 4\int_0^{|x-y|}\psi^{-1}\Big(\frac{B}{u^{2d}}\Big)\,dp(u), \qquad (2.12)$$
proving (2.2).

The Kolmogorov theorem

We may now apply Theorem 2.1 to various stochastic processes. We begin with the Kolmogorov theorem.

Theorem 2.2 Let $X_t$, $t\in T=[a_1,b_1]\times\dots\times[a_d,b_d]\subset\mathbb{R}^d$, be a real-valued stochastic process. Suppose there are constants $k>1$, $C>0$ and $\varepsilon>0$ so that for all $s,t\in T$ we have
$$\mathbb{E}\big(|X(t)-X(s)|^k\big) \le C|t-s|^{d+\varepsilon}. \qquad (2.13)$$
Then $X(t)$ has a continuous modification $\tilde X(t)$. Moreover, $\tilde X(t)$ has the following modulus of continuity:
$$|\tilde X(t)-\tilde X(s)| \le Y|t-s|^{\varepsilon/k}\Big(\log\frac{c_1}{|t-s|}\Big)^{2/k}, \qquad (2.14)$$

with a deterministic constant $c_1>0$, and a random variable $Y$ such that $\mathbb{E}(Y^k)\le C$. Finally, if $\mathbb{E}(|X_t|^k)<+\infty$ for some $t$, then
$$\mathbb{E}\Big(\sup_{t\in T}|X_t|^k\Big) < +\infty.$$
Proof. Without loss of generality we will assume that $T$ is the unit cube $Q_1$. We will use Theorem 2.1 with $\psi(x)=|x|^k$ and
$$p(x) = |x|^{(2d+\varepsilon)/k}\Big(\log\frac{c_1}{|x|}\Big)^{2/k}.$$
This function is increasing on $[0,\sqrt d]$ with an appropriately large choice of $c_1$. The function $f$ in Theorem 2.1 is taken to be $X(t;\omega)$, for a fixed realization $\omega$. This gives
$$B(\omega) = \int_{Q_1}\int_{Q_1}\frac{|X(t;\omega)-X(s;\omega)|^k}{[p(|t-s|/\sqrt d)]^k}\,dt\,ds = C\int_{Q_1}\int_{Q_1}\frac{|X(t;\omega)-X(s;\omega)|^k}{|t-s|^{2d+\varepsilon}\log^2(c_1/|t-s|)}\,dt\,ds.$$
Taking the expectation, and using (2.13), we obtain
$$\mathbb{E}(B) \le C\int_{Q_1}\int_{Q_1}\frac{\mathbb{E}|X(t;\omega)-X(s;\omega)|^k}{|t-s|^{2d+\varepsilon}\log^2(c_1/|t-s|)}\,dt\,ds \le C\int_{Q_1}\int_{Q_1}\frac{dt\,ds}{|t-s|^{d}\log^2(c_1/|t-s|)}.$$
The integral in the right side, for a fixed $t$, and when $c_1$ is sufficiently large, behaves as
$$\int_0^{1/2} r^{d-1}\,\frac{1}{r^{d}\log^2 r}\,dr = \int_0^{1/2}\frac{dr}{r\log^2 r} < +\infty.$$
Therefore, $\mathbb{E}(B)<\infty$, and $B$ is finite almost surely. Going back to (2.2) we get
$$|X(t)-X(s)| \le 8B^{1/k}\int_0^{|t-s|}\frac{1}{u^{2d/k}}\,p'(u)\,du. \qquad (2.15)$$
It is now a calculus exercise to check that (2.14) holds with $Y=CB^{1/k}$. The last statement in the theorem is an immediate consequence of the modulus of continuity estimate (2.14) and the moment bound on $Y$.

Regularity of a Gaussian process

Theorem 2.3 Let $X_t$, $t\in Q_1\subset\mathbb{R}^d$, be a mean-zero Gaussian process, and set
$$p(u) = \max_{|s-t|\le u\sqrt d}\mathbb{E}\{|X_t-X_s|^2\}^{1/2}. \qquad (2.17)$$
If
$$\int_0^1\Big(\log\frac1u\Big)^{1/2} dp(u) < +\infty,$$
then $X$ has a continuous version with the modulus of continuity
$$m(\delta) \le C\int_0^\delta\Big(\log\frac1u\Big)^{1/2} dp(u) + Y p(\delta), \qquad (2.18)$$
with a universal constant $C>0$ and a random variable $Y$ such that $\mathbb{E}(\exp(c_1Y^2))<+\infty$ for a sufficiently small universal constant $c_1>0$.

Proof. We will use Theorem 2.1 with $\psi(x)=e^{x^2/4}$, and $p(u)$ as in (2.17). The function $p(u)$ has to satisfy the assumption of the aforementioned theorem, otherwise the modulus of continuity (2.18) becomes vacuous. Note that (2.17) implies that
$$D_{st} = \frac{X_s-X_t}{p(|t-s|/\sqrt d)}$$
is a mean-zero Gaussian variable with variance $\sigma_{st}\le 1$. It follows that
$$\mathbb{E}(B) = \int_{Q_1}\int_{Q_1}\mathbb{E}\big(\exp(D_{st}^2/4)\big)\,ds\,dt \le C < +\infty.$$
Therefore, $B<+\infty$ a.s., and the rest is a direct application of Theorem 2.1 and integration by parts, as in the proof of Theorem 2.2. Note that in our case we have $\psi^{-1}(v)=2\sqrt{\log v}$ for $v\ge 1$, which explains the modulus of continuity in (2.18). The same computation shows that the random variable $Y$ can be taken as $Y=C\sqrt{\log B}$.

A useful example is when $p(u)=u^\varepsilon$, that is, if we assume that a Gaussian process satisfies
$$\mathbb{E}\{|X_t-X_s|^2\} \le C|t-s|^{2\varepsilon}.$$
Then (2.18) says that $X(t)$ is Hölder continuous with any exponent smaller than $\varepsilon$.

3 Stochastic Integrals

In order to talk about the weak solutions of stochastic partial differential equations, we need to define carefully stochastic integration with respect to white noises.

The Ito integral

Before talking about stochastic integrals with respect to space-time white noises $\dot W(t,x)$, let us very briefly recollect the steps in the definition of the Ito integral, which is what we will generalize to higher dimensions. The Ito integral with respect to the Brownian motion,
$$\int_0^t f(s,\omega)\,dB_s,$$
is first defined for elementary functions of the form
$$f(t,\omega) = X(\omega)\chi_{(a,b]}(t).$$
Here, $X(\omega)$ is an $\mathcal{F}_a$-measurable function (recall that this innocent sounding assumption is absolutely essential for the Ito integral, as opposed to the Stratonovich and other stochastic integrals, to be a martingale), with a finite second moment: $\mathbb{E}[X^2]<+\infty$.

For such elementary functions we define the Ito integral as
$$\int_0^t f(s,\omega)\,dB_s = \begin{cases} 0, & \text{if } 0<t<a,\\ X(\omega)(B_t-B_a), & \text{if } a\le t\le b,\\ X(\omega)(B_b-B_a), & \text{if } b<t.\end{cases} \qquad (3.1)$$
This may be written more succinctly as
$$\int_0^t f(s,\omega)\,dB_s = X(\omega)\big[B_{t\wedge b}-B_{t\wedge a}\big]. \qquad (3.2)$$
This expression can be immediately generalized to simple functions: these are linear combinations of elementary functions $f_j$,
$$f(t,\omega) = \sum_{j=1}^n c_j f_j(t,\omega),$$
with deterministic constants $c_j$. The main observation that allows to go further and define the Ito integral for more general functions is the Ito isometry: it is easy to check from the above definition that for a simple function $f(t,\omega)$ we have
$$\mathbb{E}\Big(\int_0^t f(s,\omega)\,dB_s\Big)^2 = \mathbb{E}(X^2(\omega))\,(t\wedge b - t\wedge a) = \int_0^t\mathbb{E}(f^2(s,\omega))\,ds. \qquad (3.3)$$
Then one can verify that simple functions are dense in $L^2(\Omega\times[0,T])$ and define the stochastic integral for all such functions as an object in $L^2(\Omega)$. This is also how we will construct the stochastic integral with respect to space-time white noises. Another important aspect of the Ito integral is that, as it is defined for $\mathcal{F}_t$-adapted functions $f$, the integral
$$I_t = \int_0^t f(s,\omega)\,dB_s$$
is a martingale. Its quadratic variation is
$$\langle I,I\rangle_t = \int_0^t f^2(s,\omega)\,ds.$$
This follows from the Ito formula:
$$d(I_t^2) = f^2(t,\omega)\,dt + 2I_t f(t,\omega)\,dB_t.$$

Martingale measures

In order to construct the stochastic integral with respect to a white noise, we need the notion of a $\sigma$-finite $L^2$-valued measure. This is defined as follows. Let $E$ be a subset of $\mathbb{R}^d$, and $\mathcal{B}$ be an algebra of measurable sets. Typically, unless specified otherwise, we will take $E=\mathbb{R}^d$ and $\mathcal{B}$ the collection of the Borel sets. Let $U$ be a real-valued random set function on $\mathcal{B}$. We denote
$$\|U(A)\|_2 = \big(\mathbb{E}(U^2(A))\big)^{1/2}.$$

Assume there exists an increasing sequence of measurable sets $E_n$ such that $E=\cup_n E_n$, and
$$\sup\{\|U(A)\|_2:\ A\subseteq E_n\} < +\infty, \quad\text{for all } n.$$
Then we say that the function $U$ is $\sigma$-finite. It is countably additive if, for each $n$, given that $A_j\subseteq E_n$ and $A_j$ is a decreasing sequence of sets with an empty intersection, we have
$$\lim_{j\to+\infty}\|U(A_j)\|_2 = 0.$$
Then we say that $U$ is a $\sigma$-finite $L^2$-valued measure. It is easy to see that the white noise is an example of an $L^2$-valued $\sigma$-finite measure, as $\|\dot W(A)\|_2=|A|^{1/2}$.

We will now split one variable in the noise, and call it $t$, while keeping all other variables as spatial variables. A process $M_t(A)$, $A\in\mathcal{B}$, is a martingale measure if
(i) $M_0(A)=0$ a.s., for all $A\in\mathcal{B}$;
(ii) for $t>0$ fixed, $M_t(A)$ is a $\sigma$-finite $L^2$-valued measure;
(iii) for all $A\in\mathcal{B}$ fixed, $M_t(A)$ is a mean-zero martingale.

Let us check that the white noise process
$$W_t(A) = \dot W([0,t]\times A)$$
is a martingale measure. First, we have $W_0(A)=0$ a.s. because
$$\mathbb{E}[W_t(A)]^2 = t|A|.$$
This also implies that $W_t(A)$ is a $\sigma$-finite $L^2$-valued measure. Finally, to see that $W_t(A)$ is a martingale for each $A\in\mathcal{B}$ fixed, we observe that for all $t\ge s\ge u\ge 0$ we have
$$\mathbb{E}[(W_t(A)-W_s(A))W_u(A)] = \mathbb{E}\big[\big(\dot W([0,t]\times A)-\dot W([0,s]\times A)\big)\dot W([0,u]\times A)\big] = u|A| - u|A| = 0.$$
It follows that the increment $W_t(A)-W_s(A)$ is independent of $\mathcal{F}_s$, the $\sigma$-algebra generated by $W_r(A)$, with $r\le s$, $A\in\mathcal{B}$. Hence, we have
$$\mathbb{E}(W_t(A)\,|\,\mathcal{F}_s) = \mathbb{E}(W_t(A)-W_s(A)\,|\,\mathcal{F}_s) + W_s(A) = \mathbb{E}(W_t(A)-W_s(A)) + W_s(A) = W_s(A),$$
and $W_t(A)$ is a martingale.
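The two facts just verified, $\mathbb{E}[W_t(A)^2]=t|A|$ and the martingale property (equivalently, $\mathbb{E}[W_s(A)W_t(A)]=s|A|$ for $s\le t$), can be seen in a short simulation. A minimal sketch (not from the notes; the grid and the set size $|A|=0.5$ are arbitrary choices), where $W_t(A)$ is built from independent Gaussian increments in time:

```python
import numpy as np

def sample_W(t_grid, A_len, n_paths, rng):
    """Sample paths of W_t(A) = W([0,t] x A): the increment over
    [t_i, t_{i+1}] x A is an independent N(0, (t_{i+1}-t_i)*|A|) variable."""
    dts = np.diff(t_grid)
    incr = rng.normal(0.0, np.sqrt(dts * A_len), size=(n_paths, len(dts)))
    paths = np.cumsum(incr, axis=1)
    return np.concatenate([np.zeros((n_paths, 1)), paths], axis=1)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 201)
W = sample_W(t, A_len=0.5, n_paths=100_000, rng=rng)
var_at_2 = np.mean(W[:, -1] ** 2)        # should be near t|A| = 2 * 0.5 = 1.0
cov_1_2 = np.mean(W[:, 100] * W[:, -1])  # should be near s|A| = 1 * 0.5 = 0.5
```

The second estimate is the uncorrelated-increments computation from the text in disguise: $\mathbb{E}[W_1(A)W_2(A)]=\mathbb{E}[W_1(A)^2]$.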

The stochastic integral for simple functions

In order to define the stochastic integration we begin with the simple functions, as for the Ito integral. We say that a function $f(t,x,\omega)$ is elementary if it has the form
$$f(t,x,\omega) = X(\omega)\,\mathbb{1}_{(a,b]}(t)\,\mathbb{1}_A(x). \qquad (3.4)$$
Here, $A$ is a Borel set, and the random variable $X$ is bounded and $\mathcal{F}_a$-measurable; the latter condition is very important, as it was for the Ito integral. A simple function is a linear combination of finitely many elementary functions (with deterministic coefficients). We will denote by $\mathcal{P}$ the $\sigma$-algebra generated by all simple functions. It is called the predictable $\sigma$-algebra.

Given an elementary function $f$, we define the stochastic-integral process of $f$ as
$$(f\cdot M)_t(B)(\omega) = X(\omega)\big[M_{t\wedge b}(A\cap B) - M_{t\wedge a}(A\cap B)\big](\omega). \qquad (3.5)$$
On the informal level, this agrees with
$$\int_0^t\int_B f\,dM(s,x) = X(\omega)\int_0^t\int_B \chi_{(a,b]}(s)\chi_A(x)\,dM(s,x) = \begin{cases} 0, & \text{if } 0<t<a,\\ X(\omega)[M_t(A\cap B)-M_a(A\cap B)], & \text{if } a\le t\le b,\\ X(\omega)[M_b(A\cap B)-M_a(A\cap B)], & \text{if } b<t.\end{cases}$$
This is a direct generalization of (3.1)-(3.2) for the Ito integral. We extend the definition (3.5) to simple functions in a straightforward way, by linear combinations. Note that if $f$ is a simple function and $M_t(A)$ is a martingale measure, then $f\cdot M$ is a martingale measure as well.

Let us now compute the second moment of the stochastic integral process of an elementary function (3.4). We take a Borel set $B$ and find
$$\mathbb{E}\big[((f\cdot M)_t(B))^2\big] = \mathbb{E}\big[X^2[M_{t\wedge b}(A\cap B) - M_{t\wedge a}(A\cap B)]^2\big] \qquad (3.6)$$
$$= \mathbb{E}[X^2 M^2_{t\wedge b}(A\cap B)] + \mathbb{E}[X^2 M^2_{t\wedge a}(A\cap B)] - 2\mathbb{E}[X^2 M_{t\wedge b}(A\cap B)M_{t\wedge a}(A\cap B)].$$
Let us recall that $X$ is $\mathcal{F}_a$-measurable. Hence, by the definition of the quadratic variation, we have
$$\mathbb{E}\big[X^2\big(M^2_{t\wedge b}(A\cap B) - \langle M(A\cap B),M(A\cap B)\rangle_{t\wedge b}\big)\big] \qquad (3.7)$$
$$= \mathbb{E}\big[X^2\big(M^2_{t\wedge a}(A\cap B) - \langle M(A\cap B),M(A\cap B)\rangle_{t\wedge a}\big)\big],$$
and, since $M_t$ is a martingale measure, we also have
$$\mathbb{E}\big[X^2 M_{t\wedge b}(A\cap B)M_{t\wedge a}(A\cap B)\big] = \mathbb{E}\big[X^2 M^2_{t\wedge a}(A\cap B)\big].$$
(3.8)

Using this in (3.6) gives
$$\mathbb{E}\big[((f\cdot M)_t(B))^2\big] = \mathbb{E}\big[X^2\big(M^2_{t\wedge a}(A\cap B) - \langle M(A\cap B),M(A\cap B)\rangle_{t\wedge a}\big)\big] \qquad (3.9)$$
$$\quad + \mathbb{E}\big[X^2\langle M(A\cap B),M(A\cap B)\rangle_{t\wedge b}\big] + \mathbb{E}\big[X^2 M^2_{t\wedge a}(A\cap B)\big] - 2\mathbb{E}\big[X^2 M^2_{t\wedge a}(A\cap B)\big]$$
$$= \mathbb{E}\big[X^2\big(\langle M(A\cap B),M(A\cap B)\rangle_{t\wedge b} - \langle M(A\cap B),M(A\cap B)\rangle_{t\wedge a}\big)\big]. \qquad (3.10)$$
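Specializing (3.10) to the white noise martingale measure, where the bracket of $W(A\cap B)$ is $t|A\cap B|$, the second moment of the elementary integral is $\mathbb{E}[X^2]\,((t\wedge b)-(t\wedge a))\,|A\cap B|$. A minimal Monte Carlo sketch (not from the notes; the choice $X=W_a(A\cap B)$ and the numbers $a$, $b$, $t$, $|A\cap B|$ are arbitrary illustrations):

```python
import numpy as np

def elementary_integral_moment(a=0.5, b=1.5, t=2.0, area=0.3, n=500_000, seed=2):
    """(f.W)_t(B) for the elementary f = X 1_(a,b] 1_A with M = white noise:
    the integral is X * (W_{t^b}(A n B) - W_{t^a}(A n B)).  We take
    X = W_a(A n B), which is F_a-measurable, and |A n B| = area."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, np.sqrt(a * area), size=n)    # W_a(A n B), variance a*area
    # increment of W over (a, t^b] x (A n B), independent of F_a:
    incr = rng.normal(0.0, np.sqrt((min(t, b) - a) * area), size=n)
    return np.mean((X * incr) ** 2)

est = elementary_integral_moment()
# (3.10): E[X^2] ((t^b)-(t^a)) |A n B| = (0.5*0.3) * (1.5-0.5) * 0.3 = 0.045
```

The estimate should match $0.045$ to three decimals; making $X$ depend on the future increment instead (breaking $\mathcal{F}_a$-measurability) is an instructive way to see the formula fail.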

Let us define the covariance functional of a martingale measure $M_t$ as
$$Q_t(A,B) = \langle M(A),M(B)\rangle_t. \qquad (3.11)$$
Note that $Q_t(A,B)$ is symmetric in $A$ and $B$: $Q_t(A,B)=Q_t(B,A)$. Moreover, if $B$ and $C$ are disjoint sets, then
$$Q_t(A,B\cup C) = \langle M(A),M(B\cup C)\rangle_t = \langle M(A),M(B)+M(C)\rangle_t = \langle M(A),M(B)\rangle_t + \langle M(A),M(C)\rangle_t = Q_t(A,B)+Q_t(A,C).$$
Finally, it satisfies the Cauchy-Schwarz inequality:
$$|Q_t(A,B)|^2 \le Q_t(A,A)\,Q_t(B,B).$$
With this notation, going back to (3.9)-(3.10), we can write
$$\mathbb{E}\big[((f\cdot M)_t(B))^2\big] = \mathbb{E}\big(X^2\big[Q_{t\wedge b}(A\cap B,A\cap B) - Q_{t\wedge a}(A\cap B,A\cap B)\big]\big). \qquad (3.12)$$
In other words, for simple functions we have
$$\mathbb{E}\Big[\int_0^t\int_B f\,dM(s,x)\Big]^2 = \mathbb{E}\Big[\int_0^t\int_B\int_B f(s,x)f(s,y)\,Q(dx\,dy\,ds)\Big]. \qquad (3.13)$$
Here, we have defined
$$Q([s,t]\times A\times B) = Q_t(A,B) - Q_s(A,B). \qquad (3.14)$$
Hence, the potential majorizer for the $L^2$-estimate on the stochastic integral is $Q(dx\,dy\,ds)$. As the quadratic variation has bounded total variation, it is a good integrator.

For the white noise we may compute the quadratic variation explicitly: take a set $A\in\mathcal{B}$, and set $W_t(A)=\dot W([0,t]\times A)$. We claim that in this case
$$Q_t(A,B) = t|A\cap B|. \qquad (3.15)$$
Indeed, let $X(\omega)$ be $\mathcal{F}_s$-measurable, and consider, for $t>s$:
$$\mathbb{E}\big((W_t(A)^2 - t|A|)X\big) = \mathbb{E}\big[((W_s(A))^2 - s|A|)X\big] + \mathbb{E}\big[((W_t(A)-W_s(A))^2 - (t-s)|A|)X\big]$$
$$\quad + 2\mathbb{E}\big[W_s(A)(W_t(A)-W_s(A))X\big] = \mathbb{E}\big[((W_s(A))^2 - s|A|)X\big],$$
hence
$$Q_t(A,A) = t|A|. \qquad (3.16)$$
On the other hand, for disjoint $A$ and $B$, we know that
$$Q_t(A,B) = 0, \qquad (3.17)$$

since $M_t(A)$ and $M_t(B)$ are martingales that have increments independent of each other, so that $M_t(A)M_t(B)$ is also a martingale. It follows that for a general pair of sets $A$ and $B$ we may write
$$Q_t(A,B) = \langle M(A),M(B)\rangle_t = \langle M(A\setminus B)+M(A\cap B),\,M(B\setminus A)+M(A\cap B)\rangle_t = \langle M(A\cap B),M(A\cap B)\rangle_t = t|A\cap B|,$$
which is (3.15). The measure (3.14) that corresponds to the white noise is, therefore, simply
$$Q([s,t]\times A\times B) = (t-s)|A\cap B|. \qquad (3.18)$$
This may be written formally as
$$Q(dx\,dy\,dt) = \delta(x-y)\,dx\,dy\,dt, \qquad (3.19)$$
which was our intent from the very beginning! The general construction of a stochastic integral proceeds for the so-called worthy measures. These are the measures such that $Q(dx\,dy\,dt)$ defined via (3.14) can be majorized by a positive-definite signed measure $K(dx\,dy\,dt)$. For simplicity of notation, we will restrict ourselves to the stochastic integral with respect to the white noise.

The stochastic integral for predictable functions

Let us recall that we denote by $\mathcal{P}$ the $\sigma$-algebra generated by the simple functions. A function is predictable if it is $\mathcal{P}$-measurable. We can define the norm for predictable functions (keep in mind that we are only considering the white noise case here) as
$$\|f\|^2 = \mathbb{E}\Big(\int_0^T\int_{\mathbb{R}^d} |f(t,x,\omega)|^2\,dx\,dt\Big). \qquad (3.20)$$
We will denote by $\mathcal{P}_2$ the space of predictable functions of a finite norm (3.20). It is an exercise to verify that $\mathcal{P}_2$ is a Banach space. Another exercise shows that the simple functions are dense in $\mathcal{P}_2$.

Let us go back to (3.13), which for the white noise takes the form
$$\mathbb{E}\Big[\int_0^t\int_B f(s,x,\omega)\,W(ds,dx)\Big]^2 = \mathbb{E}\Big[\int_0^t\int_B f(s,x,\omega)^2\,dx\,ds\Big]. \qquad (3.21)$$
Here, $f$ is a simple function, but this allows us to generalize the notion of the stochastic integral to functions in $\mathcal{P}_2$. Indeed, if $f_n$ is a Cauchy sequence of simple functions in $\mathcal{P}_2$, then (3.21) shows that the sequence
$$\int_0^t\int_B f_n(s,x,\omega)\,W(ds,dx) \qquad (3.22)$$
is Cauchy in $L^2(\mathbb{P})$.
Hence, for any function $f \in \mathcal P_2$ we may define the stochastic integral (3.22) as the limit in $L^2(P)$ of the integrals

$\int_0^t \int_B f_n(s, x, \omega) \, W(ds, dx),$   (3.23)

where $f_n$ is a sequence of simple functions in $\mathcal P_2$ that converges to $f$ in $\mathcal P_2$. This is essentially the same procedure as in the definition of the usual Ito integral.

4 The stochastic heat equation with a Lipschitz nonlinearity: the basic theory

We now consider a very basic example of a parabolic SPDE

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + f(u) \dot W,$   (4.1)

posed on $\mathbb R$, with the initial condition $u(0, x) = u_0(x)$. The function $u_0(x)$ is deterministic and compactly supported. The nonlinearity $f(u)$ is globally Lipschitz:

$|f(u_1) - f(u_2)| \le K |u_1 - u_2|.$   (4.2)

This assumption is extremely important both for $u$ small and $u$ large: the interesting cases are what happens when $f(u) \sim \sqrt u$ for small $u$ (this will lead to compactly supported solutions), and when $f(u) \sim u^2$ for $u$ large (this may lead to blow-up of solutions in a finite time). For now, we deliberately avoid both, and stay within the realm of Lipschitz nonlinearities for simplicity, but will come back to them later. It is sometimes helpful to assume that $f$ is, in addition, bounded, but we will avoid this for the moment, as we would like to include the standard stochastic heat equation

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + u \dot W(t, x).$   (4.3)

On an intuitive level, the noise acts as a huge and very irregular force in the heat equation, making even very familiar properties of the solutions of the heat equation somewhat non-obvious. For example, since $\dot W$ can be huge and positive, may that bring about growth at infinity that would knock the solution out of the $L^p(\mathbb R)$ spaces? On the other hand, as the noise can be very negative, a priori it is by no means obvious that the strong maximum principle would hold: given that $u_0(x) \ge 0$, do we know that $u(t, x) > 0$? As the reader will see, getting the answers even to these questions will require some non-trivial arguments. We should also stress that the restriction to one spatial dimension is not technical or accidental, as the solutions have a very different nature in dimension $d > 1$.

The mild solutions: existence and uniqueness

Let us first make precise which notion of a solution we will use.
The Duhamel formula tells us that if the noise were smooth, the solution of (4.1) would have the form

$u(t, x) = \int_{\mathbb R} G(t, x - y) u_0(y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y),$   (4.4)

and this will be our starting point. That is, a solution of (4.1) is a solution of (4.4) that is adapted to the $\sigma$-algebra $\mathcal F_t$ generated by the white noise $\dot W$. Here, $G(t, x)$ is the standard heat kernel:

$G(t, x) = \frac{1}{(4\pi t)^{1/2}} e^{-x^2/(4t)}.$   (4.5)

These are also known as mild solutions.
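Two properties of the kernel (4.5) are used constantly below: it has unit mass, and it satisfies the semigroup (Chapman-Kolmogorov) property $\int G(t, x - z) G(s, z - y) \, dz = G(t + s, x - y)$. A quick numerical confirmation (our own sanity check, with arbitrary evaluation points):

```python
import numpy as np

# Sanity checks for the heat kernel (4.5): unit mass and the semigroup
# property G(t) * G(s) = G(t + s), where * is spatial convolution.
def G(t, x):
    return np.exp(-x ** 2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

x = np.linspace(-30.0, 30.0, 20001)
dx = x[1] - x[0]

mass = np.sum(G(1.0, x)) * dx                    # ∫ G(1, x) dx = 1
conv = np.sum(G(0.7, 1.5 - x) * G(0.3, x)) * dx  # (G(0.7) * G(0.3))(1.5)
print(mass, conv, G(1.0, 1.5))
```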

Theorem 4.1 The stochastic heat equation (4.1) with a globally Lipschitz nonlinearity $f(u)$ and a compactly supported initial condition $u_0(x)$ has a unique solution $u(t, x)$ such that for all $T > 0$ we have

$\sup_{x} \sup_{t \le T} E(|u(t, x)|^2) < +\infty.$   (4.6)

In other words, solutions exist and are unique in the space $\mathcal P_{2,\infty}[0, T]$ with the norm

$\|u\|^2_{\mathcal P_{2,\infty}} = \sup_{x} \sup_{t \le T} E(|u(t, x)|^2).$   (4.7)

Note that the stochastic integral in the right side of (4.4) makes sense for all functions in $\mathcal P_{2,\infty}[0, T]$.

The proof of uniqueness

We first prove uniqueness. Suppose that $u$ and $v$ are two mild solutions of (4.1) or, equivalently, of (4.4) in $\mathcal P_{2,\infty}[0, T]$. We will show that $v$ is a modification of $u$. Set

$z(t, x) = u(t, x) - v(t, x),$

and write

$z(t, x) = \int_0^t \int_{\mathbb R} G(t - s, x - y) [f(u(s, y)) - f(v(s, y))] \, W(dy \, ds).$   (4.8)

The Ito isometry implies that

$E(|z(t, x)|^2) = \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E\big[ |f(u(s, y)) - f(v(s, y))|^2 \big] \, dy \, ds.$   (4.9)

It follows from (4.2) that

$E(|z(t, x)|^2) \le K^2 \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E\big[ |u(s, y) - v(s, y)|^2 \big] \, dy \, ds = K^2 \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E\big[ |z(s, y)|^2 \big] \, dy \, ds.$

We set

$H(t) = \sup_{s \le t} \sup_{x} E(|z(s, x)|^2),$

and get

$H(t) \le K^2 \int_0^t \int_{\mathbb R} G^2(t - s, x - y) H(s) \, dy \, ds.$   (4.10)

Note that

$\int_{\mathbb R} G^2(s, y) \, dy = \frac{1}{4\pi s} \int_{\mathbb R} e^{-y^2/(2s)} \, dy = \frac{C}{\sqrt s}.$

It follows that

$H(t) \le C K^2 \int_0^t \frac{H(s)}{(t - s)^{1/2}} \, ds.$   (4.11)
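The constant $C$ in the computation of $\int G^2(s, y)\, dy$ can be made explicit: $G^2(s, y) = (4\pi s)^{-1} e^{-y^2/(2s)}$ integrates to $1/\sqrt{8\pi s}$, so $C = 1/\sqrt{8\pi}$. A quick numerical confirmation (our own check):

```python
import numpy as np

# Check ∫ G^2(s, y) dy = 1 / sqrt(8*pi*s) for several values of s.
def G(t, x):
    return np.exp(-x ** 2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

y = np.linspace(-25.0, 25.0, 200001)
dy = y[1] - y[0]

errs = []
for s in [0.1, 0.5, 2.0]:
    num = np.sum(G(s, y) ** 2) * dy
    errs.append(abs(num - 1.0 / np.sqrt(8.0 * np.pi * s)))
print(max(errs))
```

This is the constant that reappears later as $b$ in the renewal equation for the second moment.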

Hölder's inequality with any $p \in (1, 2)$ and $q$ such that

$\frac1p + \frac1q = 1$   (4.12)

implies that

$H(t)^q \le C_T \int_0^t H(s)^q \, ds.$   (4.13)

Gronwall's inequality now implies that $H(t) = 0$ for all $t$, hence $u$ and $v$ are modifications of each other.

The existence proof

The proof is via the usual Picard iteration scheme. We let $u_0(t, x) = u_0(x)$ and define iteratively

$u_{n+1}(t, x) = \int_{\mathbb R} G(t, x - y) u_0(y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u_n(s, y)) \, dW(s, y).$   (4.14)

It is easy to verify that all $u_n$ are in $\mathcal P_{2,\infty}[0, T]$. The increment

$q_n(t, x) = u_{n+1}(t, x) - u_n(t, x)$

satisfies

$q_n(t, x) = \int_0^t \int_{\mathbb R} G(t - s, x - y) \big( f(u_n(s, y)) - f(u_{n-1}(s, y)) \big) \, dW(s, y).$   (4.15)

As $f$ is Lipschitz, it follows that

$E(|q_n(t, x)|^2) = \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E\big( |f(u_n(s, y)) - f(u_{n-1}(s, y))|^2 \big) \, dy \, ds \le K^2 \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E\big( |q_{n-1}(s, y)|^2 \big) \, dy \, ds.$   (4.16)

Hence, the function

$Z_n(t) = \sup_{x} \sup_{s \le t} E(|q_n(s, x)|^2)$

satisfies

$Z_n(t) \le K^2 \int_0^t \int_{\mathbb R} G^2(t - s, x - y) Z_{n-1}(s) \, dy \, ds,$   (4.17)

and thus

$Z_n(t) \le C \int_0^t \frac{Z_{n-1}(s)}{(t - s)^{1/2}} \, ds.$   (4.18)
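The Picard iteration (4.14) can be run directly on a space-time grid with one fixed realization of the noise increments. The following sketch (our own discretization, with illustrative parameters and a stand-in localized initial condition) shows the sup-norm of the successive differences $u_{n+1} - u_n$ decaying rapidly:

```python
import numpy as np

# Finite-difference sketch of the Picard iteration (4.14) for the mild
# form (4.4), on ONE fixed realization of the white-noise increments.
rng = np.random.default_rng(2)

L, nx = 5.0, 101
x = np.linspace(-L, L, nx)
dx = x[1] - x[0]
nt, T = 20, 0.1
dt = T / nt

def G(t, z):  # heat kernel (4.5)
    return np.exp(-z ** 2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

f = lambda u: 0.5 * np.sin(u)       # a globally Lipschitz nonlinearity
u_init = np.exp(-x ** 2)            # stand-in for a localized u_0
dW = rng.normal(0.0, np.sqrt(dt * dx), size=(nt, nx))  # noise cell masses

def picard_map(u):
    """One application of the right side of (4.14) on the grid."""
    out = np.empty_like(u)
    for i in range(nt):
        t = (i + 1) * dt
        out[i] = G(t, x[:, None] - x[None, :]) @ u_init * dx  # heat part
        for j in range(i + 1):      # stochastic integral, s = j * dt
            out[i] += G(t - j * dt, x[:, None] - x[None, :]) @ (f(u[j]) * dW[j])
    return out

u = np.tile(u_init, (nt, 1))        # u_0(t, x) = u_0(x)
diffs = []
for _ in range(4):
    u_next = picard_map(u)
    diffs.append(float(np.max(np.abs(u_next - u))))
    u = u_next
print(diffs)
```

The geometric decay of `diffs` is the discrete counterpart of the contraction expressed by (4.18).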

Once again, with $p \in (1, 2)$ and $q$ as in (4.12) we get

$Z_n(t)^q \le C \int_0^t Z_{n-1}(s)^q \, ds.$   (4.19)

Iterating (4.19), as in the proof of Gronwall's lemma, gives

$Z_n(t)^q \le C_1 \frac{(Ct)^{n-1}}{(n-1)!}.$

As a consequence, we get

$\sum_{n=0}^{\infty} Z_n^{1/2}(t) < +\infty.$

It follows that the sequence $u_n(t, x)$ converges in $\mathcal P_{2,\infty}[0, T]$ to a limit $u(t, x)$. The same argument based on the Ito isometry and the global Lipschitz bound on the function $f$ implies that

$\int_0^t \int_{\mathbb R} G(t - s, x - y) f(u_n(s, y)) \, dW(s, y) \to \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y),$

also in $\mathcal P_{2,\infty}[0, T]$. We conclude that $u(t, x)$ is a solution to

$u(t, x) = \int_{\mathbb R} G(t, x - y) u_0(y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y),$   (4.20)

finishing the existence proof.

Higher moments of the solutions

One may ask if the solutions of the stochastic heat equation (4.1) that we have constructed lie in better spaces, such as $\mathcal P_{s,\infty}$ with the norm

$\|u\|^s_{\mathcal P_{s,\infty}} = \sup_{x} \sup_{t \le T} E(|u(t, x)|^s),$   (4.21)

and $s > 2$. We will not prove existence and uniqueness of the solution in $\mathcal P_{s,\infty}$ with $s > 2$ but rather estimate its norm in this space. The solution itself can be constructed by a combination of the Picard iteration and very similar arguments. We will need Burkholder's inequality.

Theorem 4.2 [Burkholder's inequality] Let $N_t$ be a continuous martingale such that $N_0 = 0$. Then for each $p \ge 2$ we have

$E|N_t|^p \le c_p E(\langle N, N \rangle_t)^{p/2},$   (4.22)

with a constant $c_p > 0$ that depends only on $p$.

As a consequence of Burkholder's inequality, for any predictable function $f$ we have

$E\Big[ \Big| \int_0^t \int_{\mathbb R^d} f(s, x) \, dW(s, x) \Big|^p \Big] \le c_p E\Big[ \int_0^t \int_{\mathbb R^d} |f(s, x)|^2 \, ds \, dx \Big]^{p/2}.$   (4.23)
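In the simplest case $f \equiv 1$ on $[0, t] \times A$, the stochastic integral in (4.23) is just $W_t(A) \sim N(0, t|A|)$, its quadratic variation is $t|A|$, and $E|W_t(A)|^4 = 3 (t|A|)^2$, so (4.22) holds with $c_4 = 3$ (with equality for this integrand). A Monte Carlo illustration (our own, with arbitrary $t$ and $|A|$):

```python
import numpy as np

# Monte Carlo check of the p = 4 case of Burkholder's inequality for the
# Gaussian integral W_t(A): E|W_t(A)|^4 = 3 (t|A|)^2.
rng = np.random.default_rng(3)

t, area = 2.0, 1.5
var = t * area
samples = rng.normal(0.0, np.sqrt(var), size=2_000_000)

lhs = np.mean(samples ** 4)    # E |W_t(A)|^4
rhs = 3.0 * var ** 2           # c_4 * E(<N, N>_t)^2 with c_4 = 3
print(lhs, rhs)
```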

Thus, the moments of the solution of the stochastic heat equation

$u(t, x) = \int_{\mathbb R} G(t, x - y) u_0(y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y)$   (4.24)

can be estimated as

$M_s(t, x) := E|u(t, x)|^s \le C + C E\Big[ \int_0^t \int_{\mathbb R} G^2(t - s', x - y) f^2(u(s', y)) \, ds' \, dy \Big]^{s/2} \le C + C E\Big[ \int_0^t \int_{\mathbb R} G^2(t - s', x - y) u^2(s', y) \, ds' \, dy \Big]^{s/2}.$   (4.25)

Here we have used the Lipschitz property of $f$. Let us assume for simplicity that $s = 4$; then we can write

$M_4(t, x) \le C + C E\Big[ \int_0^t \int_{\mathbb R} G^2(t - s, x - y) u^2(s, y) \, ds \, dy \Big]^2 = C + C \int_0^t \int_0^t \int_{\mathbb R} \int_{\mathbb R} G^2(t - s, x - y) G^2(t - s', x - y') E[u^2(s, y) u^2(s', y')] \, ds \, dy \, ds' \, dy'.$   (4.26)

Set

$\bar M_4(t) = \sup_{x} M_4(t, x);$

then we have, using the Cauchy-Schwarz inequality $E[u^2(s, y) u^2(s', y')] \le \bar M_4(s)^{1/2} \bar M_4(s')^{1/2}$,

$\bar M_4(t) \le C + C \Big( \int_0^t \frac{\bar M_4(s)^{1/2}}{(t - s)^{1/2}} \, ds \Big)^2 \le C + C \Big( \int_0^t \bar M_4(s)^{q/2} \, ds \Big)^{2/q},$   (4.27)

with any $q > 2$. Gronwall's lemma now implies that

$\sup_{t \le T} \bar M_4(t) \le C_T.$   (4.28)

This argument can be generalized to all even integers $s$, and from there to all $s < +\infty$. It is straightforward to adapt it to show the existence of the solutions in $\mathcal P_{s,\infty}[0, T]$ via the Picard iteration. Uniqueness of the solutions in $\mathcal P_{s,\infty}[0, T]$ follows immediately from the uniqueness result in $\mathcal P_{2,\infty}[0, T]$ that we have already proved.

Exercise 4.3 With a little more careful analysis one may show the following bound: there exists a constant $C > 0$ that depends only on the Lipschitz constant of $f$ and $\|u_0\|_{L^\infty}$ so that

$\sup_{x} E(|u(t, x)|^k) \le C^k e^{C k^3 t}.$   (4.29)

The spatial $L^2$-bound

Let us now consider the unique $\mathcal P_{2,\infty}[0, T]$-solution to

$u(t, x) = \int_{\mathbb R} G(t, x - y) u_0(y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y),$   (4.30)

and ask if we may expect the $L^2$-norm in space of $u(t, x)$ to remain bounded. Recall that we assumed that $u_0(x)$ is compactly supported (though this assumption can be easily weakened to $u_0 \in L^1(\mathbb R)$ in the existence and uniqueness proofs). Clearly, this is not true just under the assumption that $f(u)$ is Lipschitz: if we take $f \equiv 1$ and consider the solutions of

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \dot W(t, x),$

then there is no reason to expect that the solution has any spatial decay whatsoever. Let us, therefore, assume that, in addition to being globally Lipschitz, $f(u)$ satisfies $f(0) = 0$, so that

$|f(u)| \le K |u|.$   (4.31)

The first integral in the right side of (4.30) is obviously in any $L^p(\mathbb R)$, $1 \le p \le +\infty$, hence we only look at

$U(t, x) = \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y),$

and compute

$E \int_{\mathbb R} u^2(t, x) \, dx \le 2 \int_{\mathbb R} |u_0(x)|^2 \, dx + 2 \int_{\mathbb R} \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E[f^2(u(s, y))] \, dy \, ds \, dx \le 2 \|u_0\|_{L^2}^2 + 2 K^2 \int_{\mathbb R} \int_0^t \int_{\mathbb R} G^2(t - s, x - y) E[|u(s, y)|^2] \, dy \, ds \, dx \le 2 \|u_0\|_{L^2}^2 + C \int_0^t \int_{\mathbb R} \frac{E[|u(s, y)|^2]}{(t - s)^{1/2}} \, dy \, ds.$   (4.32)

Thus,

$Z(t) = E \int_{\mathbb R} |u(t, x)|^2 \, dx$

satisfies

$Z(t) \le 2 Z(0) + C \int_0^t \frac{Z(s)}{(t - s)^{1/2}} \, ds.$

Hence, for any $q \in (2, +\infty)$ and $t \le T$, we have

$Z^q(t) \le C + C_T \int_0^t Z^q(s) \, ds.$

Gronwall's lemma implies that

$\sup_{t \le T} Z(t) \le C_T,$

and we conclude that for every $T > 0$ we have

$\sup_{t \le T} E \int_{\mathbb R} |u(t, x)|^2 \, dx < +\infty.$   (4.33)

In the special case of the stochastic heat equation

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + u \dot W,$   (4.34)

with the initial condition $u(0, x) = u_0(x)$, we have

$u(t, x) = v(t, x) + \int_0^t \int_{\mathbb R} G(t - s, x - y) u(s, y) \, W(ds \, dy).$   (4.35)

It follows that

$Z(t, x) = E|u(t, x)|^2$

satisfies a closed equation

$Z(t, x) = v^2(t, x) + \int_0^t \int_{\mathbb R} G^2(t - s, x - y) Z(s, y) \, dy \, ds.$   (4.36)

Here, the function $v(t, x)$ is the solution of the heat equation

$\frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2},$   (4.37)

with the initial condition $v(0, x) = u_0(x)$. Hence, the spatial integral of $Z(t, x)$,

$\bar Z(t) = E \int_{\mathbb R} |u(t, x)|^2 \, dx,$

satisfies

$\bar Z(t) = \|v(t)\|^2_{L^2} + b \int_0^t \frac{\bar Z(s)}{\sqrt{t - s}} \, ds,$   (4.38)

with an explicit constant $b > 0$ (in fact $b = 1/\sqrt{8\pi}$, by the computation of $\int G^2$ above). Let us define the function

$Z_\gamma(t) = e^{-\gamma t} \bar Z(t),$

with the constant $\gamma > 0$ to be chosen. Then $Z_\gamma(t)$ satisfies

$Z_\gamma(t) = a(t) + \int_0^t g(t - s) Z_\gamma(s) \, ds,$   (4.39)

with

$a(t) = \|v(t)\|^2_{L^2} e^{-\gamma t}, \qquad g(t) = \frac{b e^{-\gamma t}}{\sqrt t}.$

Let us choose

$\gamma = \pi b^2,$   (4.40)

so that

$\int_0^\infty g(s) \, ds = 1.$   (4.41)

This is, clearly, a necessary condition for $Z_\gamma(t)$ to have a non-trivial limit as $t \to +\infty$, since $a(t) \to 0$ as $t \to +\infty$.

Exercise 4.4 Show that with this choice of $\gamma$ and this $a(t)$, the limit

$Z_\gamma^\infty = \lim_{t \to +\infty} Z_\gamma(t)$   (4.42)

exists.

In order to find the limit, let us write

$Z_\gamma(t) = Z_\gamma^\infty + \beta(t),$

with $\beta(t) \to 0$ as $t \to +\infty$, so that

$Z_\gamma^\infty + \beta(t) = a(t) + Z_\gamma^\infty \int_0^t g(t - s) \, ds + \int_0^t g(t - s) \beta(s) \, ds,$   (4.43)

and, using (4.41),

$\beta(t) = a(t) - Z_\gamma^\infty \int_t^\infty g(s) \, ds + \int_0^t g(t - s) \beta(s) \, ds.$   (4.44)

Integrating (4.44) gives

$\int_0^t \beta(s) \, ds = \int_0^t a(s) \, ds - Z_\gamma^\infty \int_0^t \int_s^\infty g(s') \, ds' \, ds + \int_0^t \int_0^s g(s - s') \beta(s') \, ds' \, ds.$   (4.45)

The long time limit of the second integral in the right side can be computed as

$\int_0^t \int_s^\infty g(s') \, ds' \, ds \to \int_0^\infty \int_s^\infty g(s') \, ds' \, ds = \int_0^\infty s g(s) \, ds, \quad \text{as } t \to +\infty,$   (4.46)

while for the last integral in the right side of (4.45) we have

$\int_0^t \int_0^s g(s - s') \beta(s') \, ds' \, ds = \int_0^t \beta(s') \int_{s'}^t g(s - s') \, ds \, ds' \to \int_0^\infty \beta(s) \, ds, \quad \text{as } t \to +\infty.$   (4.47)

Going back to (4.45), we conclude that

$Z_\gamma^\infty = \Big( \int_0^\infty s g(s) \, ds \Big)^{-1} \int_0^\infty a(s) \, ds.$   (4.48)

Therefore, the solution of the stochastic heat equation (4.34) satisfies

$E\|u(t)\|^2_{L^2} \sim Z_\gamma^\infty e^{\gamma t}, \quad \text{as } t \to +\infty,$   (4.49)

with $\gamma > 0$ given by (4.40).

Exercise 4.5 Generalize this argument to the solutions of equations of the form

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + f(u) \dot W,$   (4.50)

with a nonlinearity $f(u)$ such that $c_1 |u| \le |f(u)| \le c_2 |u|$. Obtain a lower and an upper bound for the $L^2$-norm of the solutions as in (4.49).
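The renewal equation (4.38) can be solved numerically, and one can watch the exponential rate $\gamma = \pi b^2$ of (4.40) emerge. With the normalization used in these notes, $b = 1/\sqrt{8\pi}$, so $\gamma = 1/8$. The sketch below (our own discretization, with a stand-in decaying forcing term in place of $\|v(t)\|_{L^2}^2$) uses product integration to handle the singular kernel:

```python
import numpy as np

# Numerical solve of Zbar(t) = a0(t) + b * ∫_0^t Zbar(s)/sqrt(t - s) ds,
# expecting growth at rate gamma = pi * b^2 = 1/8 for b = 1/sqrt(8*pi).
b = 1.0 / np.sqrt(8.0 * np.pi)
gamma = np.pi * b ** 2                  # = 0.125

T, dt = 50.0, 0.02
n = int(round(T / dt))
tgrid = dt * np.arange(1, n + 1)
a0 = 1.0 / np.sqrt(1.0 + tgrid)         # stand-in for ||v(t)||_{L^2}^2

# product integration: freeze Z on each subinterval, integrate the singular
# kernel exactly; the current subinterval gives the implicit factor below
Z = np.zeros(n)
for i in range(n):
    ti = tgrid[i]
    conv = 0.0
    if i > 0:
        w = 2.0 * (np.sqrt(ti - tgrid[:i] + dt) - np.sqrt(ti - tgrid[:i]))
        conv = float(np.dot(w, Z[:i]))
    Z[i] = (a0[i] + b * conv) / (1.0 - 2.0 * b * np.sqrt(dt))

i0 = int(0.8 * n)
rate = (np.log(Z[-1]) - np.log(Z[i0])) / (tgrid[-1] - tgrid[i0])
print(rate, gamma)
```

The empirical growth rate over the final stretch matches $\gamma$ closely, in line with (4.49).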

The exponential growth of the second moment in (4.49) should be contrasted with the simple bound on the integral:

$E \int_{\mathbb R} u(t, x) \, dx = \int_{\mathbb R} u_0(x) \, dx.$   (4.51)

We will discuss this again when we talk about the intermittency of the solutions. Roughly, the disparity of the $L^1$ and $L^2$ norms of the solutions indicates that there are small islands where the solution is huge. We should also note that we will later show that $u(t, x) > 0$ if $u_0(x) \ge 0$ and $u_0$ does not vanish identically. Hence, the integral in the left side of (4.51) is the $L^1$-norm of $u$.

The Hölder regularity of the solutions

In order to study the Hölder regularity of the solutions, let us first make a slightly simplifying assumption that in addition to being Lipschitz, the function $f(u)$ is globally bounded:

$\sup_u |f(u)| \le K.$   (4.52)

We will later explain how this assumption can be removed, using the $\mathcal P_{s,\infty}[0, T]$ bounds on the solution with $s \in (2, +\infty)$, rather than just the bounds in $\mathcal P_{2,\infty}[0, T]$ that we will use in the proof.

Theorem 4.6 There exists a modification of the solution of (4.4) that is Hölder continuous in $x$ of any order less than $1/2$ and in $t$ of any order less than $1/4$.

We will need in the proof a slight generalization of the Kolmogorov continuity criterion; compare this to Theorem 2.2.

Theorem 4.7 Let $X_t$, $t \in T = [a_1, b_1] \times \dots \times [a_d, b_d] \subset \mathbb R^d$ be a real-valued stochastic process. Suppose there are constants $k > 1$, $C > 0$ and $\alpha_i > 0$, $i = 1, \dots, d$, so that

$q := \sum_{i=1}^d \frac{1}{\alpha_i} < 1,$

and for all $s, t \in T$ we have

$E(|X(t) - X(s)|^k) \le C \prod_{i=1}^d |t_i - s_i|^{\alpha_i}.$   (4.53)

Then $X(t)$ has a continuous modification $\tilde X(t)$. Moreover, $\tilde X(t)$ is Hölder continuous in each variable $t_i$ with any exponent $\gamma \in (0, \alpha_i(1 - q)/k)$.

Note that when all $\alpha_i = \alpha$, the assumptions require $\alpha > d$, and the process has Hölder exponent less than

$\alpha \Big( 1 - \frac{d}{\alpha} \Big) \frac{1}{k} = \frac{\alpha - d}{k},$

which is exactly Theorem 2.2. We will leave the proof as an exercise.

The proof of Theorem 4.6

Let us consider

$U(t, x) = \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, dW(s, y).$   (4.54)

We need to show that $U(t, x)$ has the required Hölder continuous modification. For $t' \ge t$ we write

$U(t', x) - U(t, x) = \int_0^t \int_{\mathbb R} [G(t' - s, x - y) - G(t - s, x - y)] f(u(s, y)) \, dW(s, y) + \int_t^{t'} \int_{\mathbb R} G(t' - s, x - y) f(u(s, y)) \, dW(s, y).$   (4.55)

Young's and Burkholder's inequalities imply that

$E|U(t', x) - U(t, x)|^p \le C_p E\Big[ \int_0^t \int_{\mathbb R} |G(t' - s, x - y) - G(t - s, x - y)|^2 f^2(u(s, y)) \, dy \, ds \Big]^{p/2} + C_p E\Big[ \int_t^{t'} \int_{\mathbb R} G^2(t' - s, x - y) f^2(u(s, y)) \, dy \, ds \Big]^{p/2} = I + II.$   (4.56)

Using the assumption that $|f(u)| \le K$, the second term can be estimated as

$II \le C \Big[ \int_t^{t'} \int_{\mathbb R} G^2(t' - s, x - y) \, dy \, ds \Big]^{p/2} \le C \Big[ \int_t^{t'} \frac{ds}{(t' - s)^{1/2}} \Big]^{p/2} \le C |t' - t|^{p/4}.$   (4.57)

For the first term in the right side of (4.56) we write, using the Plancherel identity,

$\int_{\mathbb R} |G(t' - s, x - y) - G(t - s, x - y)|^2 \, dy = \int_{\mathbb R} |G(t' - s, y) - G(t - s, y)|^2 \, dy = C \int_{\mathbb R} \big| e^{-(t' - s)\xi^2} - e^{-(t - s)\xi^2} \big|^2 \, d\xi = C \int_{\mathbb R} e^{-2(t - s)\xi^2} \big[ 1 - e^{-(t' - t)\xi^2} \big]^2 \, d\xi.$   (4.58)

It follows that the first term in the right side of (4.56) can be bounded as

$I^{2/p} \le C \int_0^t \int_{\mathbb R} e^{-2(t - s)\xi^2} \big[ 1 - e^{-(t' - t)\xi^2} \big]^2 \, d\xi \, ds = C \int_{\mathbb R} \frac{1}{\xi^2} \big( 1 - e^{-2t\xi^2} \big) \big[ 1 - e^{-(t' - t)\xi^2} \big]^2 \, d\xi.$   (4.59)

Now, we use the following two elementary estimates: first, there exists $C_T > 0$ so that for all $t \le T$ and all $\xi$ we have

$\frac{1}{\xi^2} \big( 1 - e^{-2t\xi^2} \big) \le \frac{C_T}{1 + \xi^2},$

and, second,

$\big[ 1 - e^{-(t' - t)\xi^2} \big]^2 \le \min[(t' - t)\xi^2, 1].$

Using these estimates in (4.59) gives

$I^{2/p} \le C \int_{\mathbb R} \frac{1}{1 + \xi^2} \min[(t' - t)\xi^2, 1] \, d\xi = C_T \int_{|\xi| \le |t' - t|^{-1/2}} \frac{(t' - t)\xi^2}{1 + \xi^2} \, d\xi + C_T \int_{|\xi| \ge |t' - t|^{-1/2}} \frac{d\xi}{1 + \xi^2} \le C_T |t' - t|^{1/2}.$
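The last estimate, that the frequency integral scales like $\sqrt{t' - t}$, is exactly what produces the exponent $p/4$ in the time increment bound. A quick numerical check (ours) that the ratio of the integral to $\sqrt\Delta$ stays bounded as $\Delta = t' - t$ shrinks:

```python
import numpy as np

# Check that ∫ min(Delta * xi^2, 1) / (1 + xi^2) dxi <= C * sqrt(Delta).
xi = np.linspace(-2000.0, 2000.0, 400001)
dxi = xi[1] - xi[0]

ratios = []
for Delta in [1e-1, 1e-2, 1e-3, 1e-4]:
    integral = np.sum(np.minimum(Delta * xi ** 2, 1.0) / (1.0 + xi ** 2)) * dxi
    ratios.append(integral / np.sqrt(Delta))
print(ratios)   # roughly constant, so the integral scales like sqrt(Delta)
```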

We conclude that

$E|U(t', x) - U(t, x)|^p \le C_T |t' - t|^{p/4}.$   (4.60)

Exercise 4.8 Show that

$E|U(t, x) - U(t, x')|^p \le C_p E\Big( \int_0^t \int_{\mathbb R} |G(t - s, y) - G(t - s, x - x' - y)|^2 \, dy \, ds \Big)^{p/2},$   (4.61)

and then use a computation similar to what we have done to show that

$E|U(t, x) - U(t, x')|^p \le C_T |x - x'|^{p/2}.$   (4.62)

Summarizing, we have

$E|U(t', x') - U(t, x)|^p \le C_T \big( |t' - t|^{p/4} + |x' - x|^{p/2} \big).$   (4.63)

Now, we use Theorem 4.7 with $k = p$, $\alpha_x = p/2$ and $\alpha_t = p/4$, so that

$q = \frac{1}{\alpha_x} + \frac{1}{\alpha_t} = \frac{2}{p} + \frac{4}{p} = \frac{6}{p},$

so we get Hölder continuity in $t$ with any exponent smaller than

$\gamma_t = \frac{\alpha_t (1 - q)}{p} = \frac14 \Big( 1 - \frac6p \Big),$

and in $x$ with any exponent smaller than

$\gamma_x = \frac{\alpha_x (1 - q)}{p} = \frac12 \Big( 1 - \frac6p \Big).$

As $p > 6$ can be taken arbitrarily large, it follows that $u(t, x)$ is Hölder continuous in $x$ with any exponent smaller than $1/2$ and in $t$ with any exponent smaller than $1/4$.

Exercise 4.9 Use the bounds on the higher moments $E|u(t, x)|^p$ to improve the argument above to show that the almost sure Hölder regularity of $u(t, x)$ with the same exponents holds under the (weaker) assumption that the function $f(u)$ is Lipschitz rather than bounded, removing assumption (4.52).

The comparison principle

It is well known that if a function $f(u)$ is Lipschitz and $f(0) = 0$, then parabolic equations of the form

$\frac{\partial u}{\partial t} = \Delta u + f(u)$   (4.64)

satisfy the comparison principle. That is, if $u(t, x)$ and $v(t, x)$ are two solutions of (4.64) and $u(0, x) \le v(0, x)$ for all $x$, then $u(t, x) \le v(t, x)$ for all $t$ and $x$. Moreover, the strong comparison principle says that actually $u(t, x) < v(t, x)$ for all $t > 0$ and $x$, provided that $u(0, x) \not\equiv v(0, x)$. These results are easily generalized to equations of the form

$\frac{\partial u}{\partial t} = \Delta u + g(t, x) f(u),$   (4.65)

with a regular function g(t, x). Here, we prove the following comparison theorem for the solutions of the stochastic heat equation t = 2 u + f(u)ẇ, (4.66) x2 with a Lipschitz nonlinearity f(u). The difference with the classical PDE results is that the noise Ẇ is highly irregular. The canonical PDE proof with a bounded function g(t, x) relies on the fact that the Hessian of a function at a minimum is non-negative definite matrix. Here, we can not use this strategy since the solutions are merely Hölder with exponent less than 1/2 in space. Theorem 4.1 Let u(t, x) and v(t, x) be two solutions of (4.66) such that u(, x) v(, x). Then, almost surely, we have u(t, x) v(t, x) for all t and x. Note that we do not yet claim the strong comparison principle, which says that u(t, x) > v(t, x) for all t > and x unless u (x) v (x). This will be done slightly later. The idea of the proof of Theorem 4.1 is to use numerical analysis. We construct approximate solutions u n (t, x) and v n (t, x) such that almost surely we have u n (t, x) v n (t, x) for all t and x and then pass to the limit n +. The approximation is done by time-splitting in time and discretizing space. This is also an alternative way to construct the solutions of the original SPDE. The time splitting schemes Solution of a linear equation of the form is given by du dt If the linear operators A and B commute then we have This means that we can solve first = (A + B)u, (4.67) u(t) = e (A+B)t u. (4.68) u(t) = e At v(t), v(t) = e Bt u. (4.69) dv dt = Bv, v() = u, t T, followed by du = Au, u() = v(t ), t T, dt and obtain the correct u(t ). When the operators A and B do not commute, one relies on the Trotter formula e (A+B)t = lim n + (ea/n e B/n ) n. (4.7) 28

The corresponding time-splitting scheme proceeds as follows. We divide the time axis $t > 0$ into intervals of the form

$T_{nj} = \Big\{ \frac{j}{n^2} \le t < \frac{j + 1}{n^2} \Big\}.$   (4.71)

On each time interval $T_{nj}$ we first solve

$\frac{\partial v}{\partial t} = B v, \qquad v\Big( \frac{j}{n^2} \Big) = u\Big( \frac{j}{n^2} \Big), \qquad \frac{j}{n^2} \le t \le \frac{j + 1}{n^2},$   (4.72)

followed by

$\frac{\partial u}{\partial t} = A u, \qquad u\Big( \frac{j}{n^2} \Big) = v\Big( \frac{j + 1}{n^2} \Big), \qquad \frac{j}{n^2} \le t \le \frac{j + 1}{n^2},$   (4.73)

which gives us $u((j + 1)/n^2)$, and we can solve (4.72) on the time interval $T_{n,j+1}$, and so on. Convergence of $u(t)$ to the solution of (4.67) is guaranteed by the Trotter formula under certain assumptions on $A$ and $B$.

The spatial discretization and time-splitting for the stochastic heat equation

For the stochastic heat equation

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + f(u) \dot W,$   (4.74)

we would like to consider the following time-splitting scheme: the first step is solving a pointwise SDE

$\frac{\partial u_{n,j+1/2}}{\partial t} = f(u_{n,j+1/2}) \dot W, \qquad \frac{j}{n^2} \le t < \frac{j + 1}{n^2},$   (4.75)

with the initial condition

$u_{n,j+1/2}\Big( \frac{j}{n^2}, x \Big) = u_{n,j}\Big( \frac{j}{n^2}, x \Big),$

followed by the heat equation

$\frac{\partial u_{n,j+1}}{\partial t} = \frac{\partial^2 u_{n,j+1}}{\partial x^2}, \qquad \frac{j}{n^2} \le t < \frac{j + 1}{n^2},$   (4.76)

with the initial condition

$u_{n,j+1}\Big( \frac{j}{n^2}, x \Big) = u_{n,j+1/2}\Big( \frac{j + 1}{n^2}, x \Big).$

This would give us the initial condition $u_{n,j+1}((j + 1)/n^2, x)$ for the next SDE step (4.75) on the time interval $T_{n,j+1}$, and we would be able to re-start. One difficulty is making sense of (4.75), as it is neither an SDE nor an SPDE. Hence, in addition to the time-splitting, we will discretize in space. We will consider functions $u_{nj}(t, x)$ that are piecewise constant on the spatial intervals

$I_{nk} = \Big\{ \frac{k}{n} - \frac{1}{2n} \le x < \frac{k}{n} + \frac{1}{2n} \Big\}.$

Each $u_{nj}$ is defined on the time interval $T_{nj}$. Given $u_{nj}(j/n^2, x)$, in order to define $u_{n,j+1/2}$ and $u_{n,j+1}$, we first solve a family of SDEs

$u_{n,j+1/2}\Big( t, \frac kn \Big) = u_{nj}\Big( \frac{j}{n^2}, \frac kn \Big) + n \int_{j/n^2}^t \int_{I_{nk}} f\Big( u_{n,j+1/2}\Big( s, \frac kn \Big) \Big) \, W(ds \, dy).$   (4.77)

In other words, on the time interval $T_{nj}$, the piecewise constant in space function $u_{n,j+1/2}(t, x)$ satisfies the SDE

$du_n\Big( t, \frac kn \Big) = \sqrt n \, f\Big( u_n\Big( t, \frac kn \Big) \Big) \, dB_k(t),$   (4.78)

where

$B_k(t) = \sqrt n \int_0^t \int_{I_{nk}} W(ds \, dy)$

is a standard Brownian motion (the normalization $\sqrt n$ makes its variance equal to $t$, since $|I_{nk}| = 1/n$). In order to incorporate the heat equation step, we consider the discrete Laplacian

$\Delta_n u\Big( \frac kn \Big) = n^2 \Big[ u\Big( \frac{k + 1}{n} \Big) + u\Big( \frac{k - 1}{n} \Big) - 2 u\Big( \frac kn \Big) \Big].$

The function $u_{n,j+1}(t, x)$, also defined on the time interval $T_{nj}$, is the solution of

$\frac{\partial u_{n,j+1}}{\partial t} = \Delta_n u_{n,j+1}, \qquad t \in T_{nj},$   (4.79)

with the initial condition $u_{n,j+1}(j/n^2, x) = u_{n,j+1/2}((j + 1)/n^2, x)$.

It is convenient to re-write the above scheme in terms of the Green's function $G_n(t, x, y)$ of the discrete Laplacian. It is defined for the lattice points of the form $x = k/n$, $y = m/n$, and is the solution of

$\frac{\partial G_n}{\partial t} = \Delta_n G_n,$   (4.80)

with the initial condition

$G_n\Big( 0, \frac kn, \frac mn \Big) = \begin{cases} n, & \text{if } k = m, \\ 0, & \text{otherwise.} \end{cases}$

We extend $G_n(t, x, y)$ to $x, y \in \mathbb R$ as

$G_n(t, x, y) = G_n\Big( t, \frac kn, \frac mn \Big), \qquad \text{if } \frac kn - \frac{1}{2n} \le x < \frac kn + \frac{1}{2n} \text{ and } \frac mn - \frac{1}{2n} \le y < \frac mn + \frac{1}{2n}.$

Let us now verify that the approximation $u_{nj}(t, x)$ that we have defined above via the time-splitting scheme is the solution of (dropping the subscript $j$)

$u_n(t, x) = \int_{\mathbb R} \bar G_n(t, 0, x, y) u_n(0, y) \, dy + \int_0^t \int_{\mathbb R} \bar G_n(t, s, x, y) f(u_n(s, y)) \, W(ds \, dy),$   (4.81)

with the initial condition

$u_n(0, x) = n \int_{k/n - 1/(2n)}^{k/n + 1/(2n)} u(0, y) \, dy, \qquad \text{for } \frac kn - \frac{1}{2n} \le x < \frac kn + \frac{1}{2n},$

and with $\bar G_n(t, s, x, y)$ defined as

$\bar G_n(t, s, x, y) = G_n\Big( \frac{[n^2 t] - [n^2 s]}{n^2}, x, y \Big),$

for all $t \ge s$. Indeed, given $t \in T_{nj}$ we have $[n^2 t] = j$, thus $\bar G_n(t, 0, x, y) = G_n(j/n^2, x, y)$. In addition, for $s \le t$ we have $\bar G_n(t, s, x, y) = \bar G_n(j/n^2, s, x, y)$. Hence we may re-write (4.81) as

$u_n(t, x) = u_n\Big( \frac{j}{n^2}, x \Big) + \int_{j/n^2}^t \int_{\mathbb R} \bar G_n(t, s, x, y) f(u_n(s, y)) \, W(ds \, dy).$   (4.82)

Next, for $x \in I_{nk}$, $t \in T_{nj}$, and $j/n^2 \le s \le t$, we have

$\bar G_n(t, s, x, y) = G_n\Big( 0, \frac kn, y \Big) = n \, 1_{I_{nk}}(y).$

Hence, (4.82) says

$u_n(t, x) = u_n\Big( \frac{j}{n^2}, x \Big) + n \int_{j/n^2}^t \int_{I_{nk}} f\Big( u_n\Big( s, \frac kn \Big) \Big) \, W(ds \, dy).$   (4.83)

Therefore, on the time interval $T_{nj}$, the piecewise constant (in space) function $u_n(t, x)$ satisfies the SDE (4.78). At the time $t = (j + 1)/n^2$ the function $u_n(t, x)$ experiences a jump. To describe it, we go back to (4.81): the solution after the jump is given by

$u_n\Big( \frac{j + 1}{n^2}+, x \Big) = \int_{\mathbb R} \bar G_n\Big( \frac{j + 1}{n^2}+, 0, x, y \Big) u_n(0, y) \, dy + \int_0^{(j+1)/n^2} \int_{\mathbb R} \bar G_n\Big( \frac{j + 1}{n^2}+, s, x, y \Big) f(u_n(s, y)) \, W(ds \, dy),$   (4.84)

while just before the jump we have

$u_n\Big( \frac{j + 1}{n^2}-, x \Big) = \int_{\mathbb R} \bar G_n\Big( \frac{j}{n^2}, 0, x, y \Big) u_n(0, y) \, dy + \int_0^{(j+1)/n^2} \int_{\mathbb R} \bar G_n\Big( \frac{j}{n^2}, s, x, y \Big) f(u_n(s, y)) \, W(ds \, dy).$   (4.85)

The semi-group property for (4.80) implies that

$G_n\Big( t, \frac kn, \frac pn \Big) = \frac1n \sum_m G_n\Big( t - s, \frac kn, \frac mn \Big) G_n\Big( s, \frac mn, \frac pn \Big),$   (4.86)

for all $0 \le s \le t$, $k$ and $p$. The continuous version of (4.86) is

$G_n(t, x, y) = \int_{\mathbb R} G_n(t - s, x, z) G_n(s, z, y) \, dz.$   (4.87)
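The scheme is easy to implement, and implementing it shows the point of the whole construction: each of the two sub-steps is monotone, so running two ordered initial data with the same noise preserves the order. The sketch below is our own implementation with illustrative parameters (the $\sqrt n$ factor comes from averaging the white noise over a cell of width $1/n$, and the heat step is substepped explicit Euler rather than the exact semigroup):

```python
import numpy as np

# Time-splitting scheme: on each slab of length 1/n^2, an Euler-Maruyama
# step for the cell SDEs driven by the Brownian motions B_k, followed by
# an explicit discrete heat step.  Same noise for both solutions.
rng = np.random.default_rng(4)

n = 20
dt = 1.0 / n ** 2
K = 2 * n + 1                                  # cells k/n, k = -n..n
xk = np.arange(-n, n + 1) / n

f = lambda u: 0.5 * np.sin(u)                  # globally Lipschitz
u = np.maximum(0.0, 1.0 - np.abs(xk))          # u(0, .) <= v(0, .)
v = u + 0.5

def heat_step(w):
    # explicit Euler for dw/dt = Delta_n w; dt * n^2 = 1 is too large for
    # one step, so take four substeps (each keeps the coefficient of w at
    # 1/2 >= 0, hence is monotone); boundary values frozen for simplicity
    for _ in range(4):
        lap = n ** 2 * (np.roll(w, 1) + np.roll(w, -1) - 2.0 * w)
        lap[0] = lap[-1] = 0.0
        w = w + (dt / 4.0) * lap
    return w

for _ in range(200):                           # 200 slabs, t in [0, 0.5]
    dB = rng.normal(0.0, np.sqrt(dt), size=K)  # increments of the B_k
    u = u + np.sqrt(n) * f(u) * dB             # SDE step, shared noise
    v = v + np.sqrt(n) * f(v) * dB
    u, v = heat_step(u), heat_step(v)

ordered = bool(np.all(u <= v + 1e-12))
print(ordered, float(np.max(v - u)))
```

The SDE step is monotone as long as $\sqrt n \, |f'| \, |dB_k| < 1$, which holds here with a huge margin; this is the discrete mechanism behind the comparison principle.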

It follows from (4.87) that for $s < (j + 1)/n^2$ we have

$\bar G_n\Big( \frac{j + 1}{n^2}, s, x, y \Big) = G_n\Big( \frac{j + 1 - [n^2 s]}{n^2}, x, y \Big) = \int_{\mathbb R} G_n\Big( \frac{1}{n^2}, x, z \Big) \bar G_n\Big( \frac{j}{n^2}, s, z, y \Big) \, dz.$

Using this in (4.84), together with (4.85), gives

$u_n\Big( \frac{j + 1}{n^2}+, x \Big) = \int_{\mathbb R} \int_{\mathbb R} G_n\Big( \frac{1}{n^2}, x, z \Big) \bar G_n\Big( \frac{j}{n^2}, 0, z, y \Big) u_n(0, y) \, dy \, dz + \int_0^{(j+1)/n^2} \int_{\mathbb R} \int_{\mathbb R} G_n\Big( \frac{1}{n^2}, x, z \Big) \bar G_n\Big( \frac{j}{n^2}, s, z, y \Big) f(u_n(s, y)) \, W(ds \, dy) \, dz = \int_{\mathbb R} G_n\Big( \frac{1}{n^2}, x, z \Big) u_n\Big( \frac{j + 1}{n^2}-, z \Big) \, dz.$   (4.88)

We see that, indeed, the passage from $u_n$ at the time $t = (j + 1)/n^2-$ to $u_n$ at $t = (j + 1)/n^2+$ is exactly via solving the discrete heat equation (4.79).

Convergence of the approximation

We will need the result of the following exercise, verified by a lengthy computation found in [4].

Exercise 4.11 Show that the following two bounds hold: first,

$\int_0^t \int_{\mathbb R} [\bar G_n(t, s, x, y) - G(t - s, x - y)]^2 \, ds \, dy \le \frac{c}{n},$   (4.89)

and, second,

$\sup_x \Big[ \int_{\mathbb R} [\bar G_n(t, 0, x, y) - G(t, x - y)] u(0, y) \, dy \Big]^2 \to 0, \quad \text{as } n \to +\infty.$   (4.90)

We now show that

$M(t) := \sup_{s \le t} \sup_x E|u_n(s, x) - u(s, x)|^2 \to 0, \quad \text{as } n \to +\infty.$   (4.91)

To see this, let us recall that

$u(t, x) = \int_{\mathbb R} G(t, x - y) u(0, y) \, dy + \int_0^t \int_{\mathbb R} G(t - s, x - y) f(u(s, y)) \, W(ds \, dy),$

and

$u_n(t, x) = \int_{\mathbb R} \bar G_n(t, 0, x, y) u_n(0, y) \, dy + \int_0^t \int_{\mathbb R} \bar G_n(t, s, x, y) f(u_n(s, y)) \, W(ds \, dy).$

Subtracting, we get

$M(t, x) = E|u_n(t, x) - u(t, x)|^2 \le C(I + II),$   (4.92)

with

$I = \Big| \int_{\mathbb R} G(t, x - y) u(0, y) \, dy - \int_{\mathbb R} \bar G_n(t, 0, x, y) u_n(0, y) \, dy \Big|^2 \to 0, \quad \text{as } n \to +\infty,$

because of (4.90) and the fact that $u_n(0, y)$ converges to $u(0, y)$ in every $L^p$-norm. The other term is

$II = E \int_0^t \int_{\mathbb R} \big| G(t - s, x - y) f(u(s, y)) - \bar G_n(t, s, x, y) f(u_n(s, y)) \big|^2 \, ds \, dy,$

and can be bounded as

$II \le C E \int_0^t \int_{\mathbb R} \big| (G(t - s, x - y) - \bar G_n(t, s, x, y)) f(u_n(s, y)) \big|^2 \, ds \, dy + C E \int_0^t \int_{\mathbb R} \big| G(t - s, x - y) (f(u(s, y)) - f(u_n(s, y))) \big|^2 \, ds \, dy = II_1 + II_2.$   (4.93)

It is straightforward to verify that there exists $C_T$ so that

$\sup_{s \le T} \sup_x E|u_n(s, x)|^2 \le C_T.$   (4.94)

This, together with (4.89) and the Lipschitz bound on $f$, means that

$II_1 \le \frac{C_T}{n}.$   (4.95)

The last term in the right side of (4.93) is bounded as

$II_2 \le C \int_0^t \frac{M(s)}{\sqrt{t - s}} \, ds.$   (4.96)

Therefore, we have an estimate for $M(t)$:

$M(t) \le \alpha(n) + C \int_0^t \frac{M(s)}{\sqrt{t - s}} \, ds,$   (4.97)

with $\alpha(n) \to 0$ as $n \to +\infty$. We conclude, by Gronwall's lemma (after the same Hölder trick as before), that

$M(t) \to 0, \quad \text{as } n \to +\infty.$   (4.98)

Back to the comparison principle

We have shown that $u(t, x)$ and $v(t, x)$ can be obtained via the time-splitting approximation scheme. The approximations $u_n(t, x)$ and $v_n(t, x)$ satisfy $u_n(t, x) \le v_n(t, x)$ if $u(0, x) \le v(0, x)$ for all $x$. This is because each of the steps in the time-splitting scheme preserves the order. Indeed, the heat equation has the comparison principle, while an SDE

$du = f(u) \, dB_t$