Exponentially Rare Large Fluctuations: Computing their Frequency

Robert S. Maier
Mathematics and Physics Departments, University of Arizona
rsm@math.arizona.edu
Dec. 1, 2000

Abstract

Random processes that model asset price evolution, such as geometric Brownian motion or more general diffusion processes with drift, may occasionally deviate radically from their expected values. The smaller the volatility, the less frequently a deviation of specified size will occur. Such occurrence rates fall off exponentially, and are hard to estimate via Monte Carlo simulation or numerical integration of diffusion equations. Large deviation theory provides a way, not based on simulation, of estimating the rate at which such rare events take place. The key concept is that of an optimal trajectory: the asymptotically most likely trajectory along which a fluctuation of specified size will occur.

I explain a new, enhanced version of this approach, which may be implemented numerically and applied to multidimensional diffusion processes with drift. It employs a system of ordinary differential equations, which must be integrated along any optimal trajectory. The new technique permits the estimation not only of asymptotic exponential falloff rates, but also of the all-important pre-exponential factor: the constant that multiplies the exponential. This yields precise predictions, in the small-volatility limit at least, for the (exponentially small!) rate at which rare fluctuations occur.

Tail probabilities computed from diffusion models often do not agree well with empirical data. The new technique applies, suitably modified, to jump-diffusion processes as well. So it should facilitate choosing a model that fits empirical data, even on the tails.

Recent Work

- R. S. Maier and D. L. Stein, Noise-activated escape from a sloshing potential well. Explains the theoretical and numerical computation of first passage time asymptotics for periodically modulated 1-D diffusion processes. Available at http://www.math.arizona.edu/~rsm.
- R. S. Maier and D. L. Stein, Limiting exit location distributions in the stochastic exit problem. Treats a boundary crossing problem for 2-D drift-diffusion processes, and the computation of limiting crossing location distributions. SIAM J. on Applied Mathematics 57 (1997), 752-790.
- R. S. Maier, Communications networks as stochastically perturbed nonlinear systems. Stresses that large deviation theory for discrete jump processes is not the same as for diffusion processes. Proc. 30th Allerton Conference on Communication, Control, and Computing.

Related Approaches

- Rare event probability estimation via simulation; importance sampling; optimal exponential twisting defined variationally. [ECE community; queueing theory and stochastic networks context.]
- Quasi-analytic weak-volatility limits for drift-diffusion processes via singular perturbation theory and matched asymptotic approximations. [Applied mathematics community, B. Matkowsky, Z. Schuss, et al., before early 1990s.]
- Quasi-analytic weak-volatility limits via optimal trajectory formalism, Hamilton's equations rather than Euler-Lagrange equations. [Theoretical physics community, M. I. Dykman et al. and RSM; since early 1990s.]

A Simple Large Deviation Example

If $\xi_1, \xi_2, \dots$ are i.i.d., with mean $0$ and variance equal to $1$, let (with large deviation scaling)

$$x^{(N)}(t) := \frac{1}{N} \sum_{i=1}^{\lfloor Nt \rfloor} \xi_i .$$

The $N$-indexed family of processes $x^{(N)}(t)$, $t \ge 0$, converges to zero as $N \to \infty$, and naively resembles the family of scaled Brownian motions $N^{-1/2} w(t)$, $t \ge 0$. (Central limit theorem!)

However, the central limit theorem applies as $N \to \infty$ only in the small deviation regime: when $x^{(N)}(t)$ and $N^{-1/2} w(t)$ are an $O(1)$ number of standard deviations $N^{-1/2}$ from their mean (zero), i.e., they are $O(N^{-1/2})$, i.e., if $x = O(N^{-1/2})$.

For fixed $t > 0$, the event $\sum_{i=1}^{\lfloor Nt \rfloor} \xi_i \ge aN$ is an exponentially unlikely event as $N \to \infty$. In terms of the rescaled family of processes $x^{(N)}(t)$, this is the above-a-level event $x^{(N)}(t) \ge a$.

Naively, this is like the event $N^{-1/2} w(t) \ge a$: an above-a-level event for a small-volatility Brownian motion, or a tail event $w(t) \ge a N^{1/2}$ for a standard Brownian motion.

But at any fixed $t > 0$, $P(x^{(N)}(t) \ge a)$ and the (seemingly similar!) $P(N^{-1/2} w(t) \ge a)$ have different exponential falloff rates as $N \to \infty$. A trivial example of this: if $\xi_i = \pm 1$ and $a > t$, the first is identically zero, but the second is not.

In the large deviation regime, the two families differ.
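The difference in falloff rates can be checked directly at $t = 1$. The following is a minimal sketch, assuming increments $\xi_i = \pm 1$ with equal probability (one admissible choice, not the only one) and the hypothetical level $a = 0.8$. The function names are illustrative. It compares the exact binomial tail of the walk with the Gaussian tail of the scaled Brownian motion.

```python
import math

def walk_tail_log_prob(n, a):
    """log P(xi_1 + ... + xi_n >= a*n) for i.i.d. xi_i = +/-1, computed exactly."""
    # The sum equals 2k - n when k of the n steps are +1.
    k_min = max(0, math.ceil((a + 1.0) * n / 2.0))
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    if count == 0:  # the level a*n is unattainable: the probability is zero
        return float("-inf")
    return math.log(count) - n * math.log(2.0)

def gauss_tail_log_prob(n, a):
    """log P(w(1) >= a*sqrt(n)) for a standard Brownian motion w at time 1."""
    return math.log(0.5 * math.erfc(a * math.sqrt(n / 2.0)))

def falloff_rates(n, a):
    """Finite-n estimates -(1/n) log P of the two exponential falloff rates."""
    return -walk_tail_log_prob(n, a) / n, -gauss_tail_log_prob(n, a) / n

if __name__ == "__main__":
    for n in (250, 500, 1000):
        walk, gauss = falloff_rates(n, 0.8)
        print(f"n={n:5d}  walk: {walk:.4f}  Brownian: {gauss:.4f}")
```

As $n$ grows, the two estimates settle near different constants, and for $a > 1$ the walk probability is exactly zero while the Gaussian one is not.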

From Cramér's Theorem to Optimal Trajectories

Central Limit Theorem: $N^{-1/2} \sum_{i=1}^{N} \xi_i \Rightarrow Z$ as $N \to \infty$, where $Z$ is standard normal, with density $(2\pi)^{-1/2} e^{-z^2/2}$.

Cramér's Theorem: $N^{-1} \sum_{i=1}^{N} \xi_i$, as $N \to \infty$, has density asymptotics that are not those of $N^{-1/2} Z$, i.e., $\propto e^{-N x^2/2}$. Rather, its density $\rho_N(x)$ satisfies

$$\rho_N(x) \sim e^{-N I(x)}, \qquad N \to \infty,$$

to logarithmic accuracy. (Various assumptions...)

$I(x)$ is a rate function, quantifying the exponential rareness of large deviations from the mean. If the $\xi_i$ are i.i.d. standard normal, then $I(x) = x^2/2$. But if the $\xi_i$ are i.i.d. $\pm 1$, then $I(x) = \infty$ for $|x| > 1$.

Rate function result: $I(x)$ is the Legendre transform of the cumulant generating function of any of the i.i.d. increments $\xi_i$:

$$\Lambda(\lambda) := \log \mathrm{E}\, e^{\lambda \xi_i}, \qquad I(x) := \sup_{\lambda}\, [\lambda x - \Lambda(\lambda)].$$

For example, if $\xi_i$ is standard normal then $\Lambda(\lambda) = \lambda^2/2$ and $I(x) = x^2/2$. But if $\xi_i$ takes values $\pm 1$ with equal probability, then

$$\Lambda(\lambda) = \log \frac{e^{\lambda} + e^{-\lambda}}{2} = \log\cosh\lambda, \qquad I(x) = \tfrac12 (1+x)\log(1+x) + \tfrac12 (1-x)\log(1-x)$$

(provided $|x| \le 1$; otherwise $I(x) = \infty$).
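The Legendre transform can also be evaluated numerically when no closed form is at hand. The sketch below is one simple way to do so, not the method of the talk: it maximizes $\lambda x - \Lambda(\lambda)$ by ternary search, which is valid because that expression is concave in $\lambda$. The search interval and the function names are assumptions made for illustration.

```python
import math

def log_cosh(lam):
    """Numerically stable log(cosh(lam)): the CGF of a +/-1 increment."""
    a = abs(lam)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def legendre_transform(cgf, x, lo=-50.0, hi=50.0, iters=200):
    """sup over lam in [lo, hi] of lam*x - cgf(lam), by ternary search.

    Relies on concavity of lam*x - cgf(lam), which holds for any CGF.
    """
    def objective(lam):
        return lam * x - cgf(lam)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def rate_pm1(x):
    """Closed-form rate function for +/-1 increments, valid for |x| < 1."""
    return 0.5 * (1 + x) * math.log(1 + x) + 0.5 * (1 - x) * math.log(1 - x)

if __name__ == "__main__":
    x = 0.8
    print("normal:", legendre_transform(lambda lam: 0.5 * lam * lam, x), x * x / 2)
    print("+/-1  :", legendre_transform(log_cosh, x), rate_pm1(x))
```

The numerical values agree with the two closed forms above; for a jump process with a less tractable cumulant generating function, only `cgf` would need to change.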

Generalization to the random processes $x^{(N)}(t)$? (Wentzell-Freidlin and Donsker-Varadhan, 1970s.)

A large deviation principle for the associated measures states that the probability that $x^{(N)}(t)$ tracks any specified trajectory $t \mapsto x(t)$, uniformly over $t \in [0, T]$, has asymptotics

$$\exp\left( -N \int_0^T I(\dot x(t))\, dt \right), \qquad N \to \infty,$$

to logarithmic accuracy. $x(0) = 0$ is required. This is an asymptotic Feynman-Kac formula.

More generally, most reasonable events, such as $x^{(N)}(t_0) \ge a > 0$ at a fixed time $t_0 \in [0, T]$, will become exponentially rare as $N \to \infty$, with the exponential falloff rate equalling the infimum of the above integral, computed over all trajectories $t \mapsto x(t)$ comprised by the event. The integral is a rate functional.

Optimal Trajectories

Mathematical definition: An optimal trajectory for the $N$-indexed family of processes $x^{(N)}(t)$ is the trajectory $t \mapsto x(t)$ which (1) satisfies $x(0) = 0$, (2) satisfies $x(T) = a$ for some specified $T$ and $a$ ("final endpoint condition"), and (3) minimizes the rate functional $\int_0^T I(\dot x(t))\, dt$.

Intuitive definition: An optimal trajectory from $0$ at time zero to $a$ at time $T$ is the most probable such trajectory in the $N \to \infty$ limit. In that limit, all trajectories that do not maintain a constant zero value are exponentially suppressed. The optimal trajectory extending to a point $a$ at time $T$ is the least suppressed one.

Easy to check: for processes like $x^{(N)}(t)$, whose increment distribution is time- and value-independent, the optimal trajectories are straight lines emanating from $(x, t) = (0, 0)$. (If $x(T) = a$, slope $= a/T$.) This follows from the convexity of $I$.
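The straight-line property is easy to confirm numerically. The following sketch assumes $\pm 1$ increments and compares the rate functional of the straight line from $(0, 0)$ to $(T, a) = (1, 0.5)$ with that of a piecewise-linear detour through the same endpoints. The detour and the helper names are hypothetical, chosen only for illustration.

```python
import math

def xlogx(u):
    """u*log(u), with the convention 0*log(0) = 0."""
    return 0.0 if u == 0.0 else u * math.log(u)

def rate_pm1(v):
    """Rate function I(v) for i.i.d. +/-1 increments."""
    if abs(v) > 1.0:
        return float("inf")
    return 0.5 * xlogx(1.0 + v) + 0.5 * xlogx(1.0 - v)

def rate_functional(times, values, rate=rate_pm1):
    """Integral of I(dx/dt) dt along a piecewise-linear trajectory."""
    total = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        slope = (values[k + 1] - values[k]) / dt
        total += rate(slope) * dt
    return total

# Straight line from (t, x) = (0, 0) to (1, 0.5).
straight = rate_functional([0.0, 1.0], [0.0, 0.5])
# Detour with the same endpoints, passing through (0.5, 0.4).
detour = rate_functional([0.0, 0.5, 1.0], [0.0, 0.4, 0.5])

if __name__ == "__main__":
    print("straight:", straight, " detour:", detour)
```

The detour has the larger rate functional, so it is exponentially less likely than the straight line as $N \to \infty$.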

Extension to General Drift-Diffusion Processes

A stochastic differential equation for an $\epsilon$-indexed family of processes $x^{(\epsilon)}(t)$, $t \ge 0$:

$$dx = u(x)\, dt + \epsilon^{1/2} \sigma(x)\, dw$$

($u$ = value-dependent drift, $\sigma$ = normalized value-dependent volatility; Itô interpretation).

The exponential falloff generalizes to:

$$\exp\left( -\frac{1}{\epsilon} \int_0^T L(x(t), \dot x(t))\, dt \right), \qquad \epsilon \to 0.$$

For a scaled family of jump processes, $L(x, \cdot)$ would be the Legendre transform of the (possibly value-dependent!) cumulant generating function of the increments. For this drift-diffusion process,

$$L(x, \dot x) := \frac{|\dot x - u(x)|^2}{2\,\sigma(x)^2}.$$

The Vector-Valued Generalization

The stochastic differential equation for an $\epsilon$-indexed family of $\mathbb{R}^n$-valued drift-diffusion processes $x^{(\epsilon)}(t)$:

$$dx = u(x)\, dt + \epsilon^{1/2} \sigma(x)\, dw$$

($u$ = state-dependent drift field, $\sigma$ = normalized state-dependent volatility matrix; Itô interpretation).

The exponential falloff generalizes to:

$$\exp\left( -\frac{1}{\epsilon} \int_0^T L(x(t), \dot x(t))\, dt \right), \qquad \epsilon \to 0,$$

where the Lagrangian function is defined by

$$L(x, \dot x) := \tfrac12\, [\dot x - u(x)]^T D(x)^{-1} [\dot x - u(x)].$$

Here $D := \sigma \sigma^T$ is the (in general, value-dependent) diffusivity matrix.

General Optimal Trajectories

Optimal trajectories

1. are not straight lines in $\mathbb{R}^n$, in general. (Cf. the small-volatility limit of geometric Brownian motion.)
2. may be computed variationally by minimizing the rate functional. This leads to the Euler-Lagrange equations
   $$\frac{d}{dt} \frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x} = 0.$$
3. are only a starting point. The probability density for the random process $x^{(\epsilon)}$ equalling $x'$ at some $t > 0$ will fall off exponentially as $\epsilon \to 0$, if $x'$ is not on the integral curve of the drift field extending from the initial value of $x^{(\epsilon)}$. The exponential falloff rate must be computed by integrating $L$ along the optimal trajectory terminating at this endpoint.

Application to Stationary Distributions

Let an $\epsilon$-indexed family of $\mathbb{R}^n$-valued drift-diffusion processes $x^{(\epsilon)}(t)$ be defined as above by a state-dependent drift field $u = u(x)$, and a normalized state-dependent volatility matrix, $\sigma = \sigma(x)$. (Reminder: true volatility $= \epsilon^{1/2} \sigma$.)

Suppose that these processes are globally stable, with unique stationary (i.e., time-invariant) probability distributions $\rho = \rho^{(\epsilon)}(x)$, and that the drift $u$ has a single attractor, near which $\rho^{(\epsilon)}$ becomes exponentially concentrated as $\epsilon \to 0$.

Example: $u(x) := -x$ and $\sigma := \sigma_0 I$ yields an $n$-dimensional Ornstein-Uhlenbeck process, with $x = 0$ the only attractor: an isolated point. Here,

$$\rho^{(\epsilon)}(x) = (\pi \sigma_0^2 \epsilon)^{-n/2} \exp\left( -\frac{|x|^2}{\epsilon\, \sigma_0^2} \right).$$

Then, the asymptotics of $\rho^{(\epsilon)}$ can be worked out to logarithmic accuracy: $\rho^{(\epsilon)}(x) \sim \exp(-W(x)/\epsilon)$.

The Action Function $W = W(x)$

1. is zero on the attractor, and strictly positive elsewhere.
2. quantifies the frequency with which the random process $x^{(\epsilon)}(t)$ visits the neighborhood of any point $x' \in \mathbb{R}^n$, in the $\epsilon \to 0$ (small-volatility) limit. It is a state-dependent exponential falloff rate.
3. can be computed from the rate functional, by taking the infimum not only over all trajectories $t \mapsto x(t)$ extending from the attractor to $x'$, but over all transit times. This yields the dominant fluctuational trajectory terminating at $x'$.

Example: If $u(x) = -x$ and $\sigma = \sigma_0 I$, then the dominant trajectory extending from $x = 0$ to $x = x'$ is a straight line with nonuniform speed, namely $x(t) = x' e^{t}$. It has infinite transit time: it emerges from $x = 0$ at $t = -\infty$ and approaches $x = x'$ at, say, $t = 0$.
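The Ornstein-Uhlenbeck example can be verified by evaluating the rate functional along $x(t) = x' e^{t}$. This is a minimal sketch under stated assumptions: the endpoint $x' = (1, -2)$, the value $\sigma_0 = 0.5$, the truncation of the time interval to $[-20, 0]$ and the helper names are all illustrative. For this process the action should come out as $W(x') = |x'|^2/\sigma_0^2$, the exponent of the stationary density, and any other path (here a uniform-speed line with finite transit time) should cost more.

```python
import math

def lagrangian(x, v, drift, sigma0):
    """L(x, v) = |v - u(x)|^2 / (2 sigma0^2), i.e. D = sigma0^2 * identity."""
    u = drift(x)
    return sum((vi - ui) ** 2 for vi, ui in zip(v, u)) / (2.0 * sigma0 ** 2)

def action(path, t0, t1, drift, sigma0, steps=20000):
    """Midpoint-rule approximation to the rate functional of t -> path(t)."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        xa = path(t0 + k * dt)
        xb = path(t0 + (k + 1) * dt)
        mid = [0.5 * (a + b) for a, b in zip(xa, xb)]
        vel = [(b - a) / dt for a, b in zip(xa, xb)]
        total += lagrangian(mid, vel, drift, sigma0) * dt
    return total

def ou_drift(x):
    """Ornstein-Uhlenbeck drift u(x) = -x."""
    return [-xi for xi in x]

X_END = [1.0, -2.0]   # hypothetical endpoint x'
SIGMA0 = 0.5          # hypothetical normalized volatility

# Dominant trajectory x(t) = x' e^t, truncated to t in [-20, 0].
dominant = action(lambda t: [xi * math.exp(t) for xi in X_END],
                  -20.0, 0.0, ou_drift, SIGMA0)
# Competing path: uniform speed from 0 to x' in transit time T = 2.
uniform = action(lambda t: [xi * t / 2.0 for xi in X_END],
                 0.0, 2.0, ou_drift, SIGMA0)
exact = sum(xi * xi for xi in X_END) / SIGMA0 ** 2

if __name__ == "__main__":
    print("dominant:", dominant, " exact W:", exact, " uniform:", uniform)
```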

Computing Dominant Trajectories and the Action Function

A pattern of outgoing dominant trajectories, and the value of $W$ at each of their endpoints, can be generated more easily from Hamilton's equations (1st-order) than from the Euler-Lagrange equations (2nd-order). For this, both the trajectory $t \mapsto x(t)$ and an auxiliary momentum trajectory $t \mapsto p(t)$ would be numerically generated, by:

$$\dot x = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial x}.$$

The Hamiltonian function $H = H(x, p)$ is defined by:

$$H(x, p) := \tfrac12\, p^T D(x)\, p + u(x) \cdot p.$$

In theoretical physics, $H$ would be an energy function, and energy is conserved: $H = \text{const}$ along any optimal or dominant trajectory.

Remarkably, for dominant trajectories (with the infimum taken over all transit times, normally yielding an infinite transit time), this energy is zero.

The trajectory $t \mapsto (x(t), p(t)) \in \mathbb{R}^n \times \mathbb{R}^n$ lies in a zero-energy hypersurface, which has codimension unity in $\mathbb{R}^n \times \mathbb{R}^n$.

The numerical punchline:

$$W(x') = \int p \cdot dx = \int p \cdot \dot x\, dt,$$

the line integral being taken along the dominant trajectory from the attractor to $x'$.
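The scheme can be sketched in one dimension. The example below is an illustration under loudly stated assumptions, not the implementation behind the talk: it uses a hypothetical drift $u(x) = x - x^3$ (attractor at $x = 1$) with $D = 1$, chosen because the action is then known in closed form, $W(x) = 2\,(x^4/4 - x^2/2 + 1/4)$. It starts a small distance `delta` from the attractor using the linearized solution and integrates Hamilton's equations together with $\dot W = p\,\dot x$ by a hand-written fixed-step Runge-Kutta method.

```python
D = 1.0  # constant diffusivity (assumption)

def u(x):
    """Hypothetical drift with an attractor at x = 1."""
    return x - x ** 3

def du(x):
    return 1.0 - 3.0 * x ** 2

def hamiltonian(x, p):
    return 0.5 * D * p * p + u(x) * p

def rhs(state):
    """Hamilton's equations, augmented by dW/dt = p * dx/dt."""
    x, p, w = state
    xdot = D * p + u(x)      # dx/dt =  dH/dp
    pdot = -du(x) * p        # dp/dt = -dH/dx
    return (xdot, pdot, p * xdot)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def dominant_trajectory(x_end, delta=1e-4, dt=1e-3):
    """Integrate outward from the attractor x = 1 until x <= x_end (< 1)."""
    hess = -2.0 * du(1.0) / D   # W''(1), from the linearized zero-energy condition
    state = (1.0 - delta, -hess * delta, 0.5 * hess * delta ** 2)
    while state[0] > x_end:
        state = rk4_step(rhs, state, dt)
    return state  # (x, p, W)

def exact_action(x):
    """Closed-form W for this 1-D example, used only as a check."""
    return 2.0 * (x ** 4 / 4.0 - x ** 2 / 2.0 + 0.25) / D

if __name__ == "__main__":
    x, p, w = dominant_trajectory(0.3)
    print("x:", x, " W numeric:", w, " W exact:", exact_action(x),
          " H:", hamiltonian(x, p))
```

The energy stays at zero to within the initialization error. In several dimensions the same loop would be run once per outgoing direction to build up the pattern of dominant trajectories.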

The Boundary Crossing Application

Suppose the attractor for the drift $u$ on $\mathbb{R}^n$ is in the interior of a region $\Omega$, which is attracted to it. (Boundary is denoted $\partial\Omega$, with coordinate $s$.)

Goal: the boundary crossing location distribution for $x^{(\epsilon)}(t)$ on $\partial\Omega$, as $\epsilon \to 0$ (small-volatility limit). Harmless assumption: $x^{(\epsilon)}(t = 0)$ is on the attractor.

Result (Donsker-Varadhan and Wentzell-Freidlin, 1970s): If the action function $W$ has a unique minimum on $\partial\Omega$ at some point $s_{\min}$, then the crossing location distribution $p^{(\epsilon)}(s)\, ds$ concentrates at $s = s_{\min}$ as $\epsilon \to 0$.

Formal extension: If $W = W(s)$ has a continuous 2nd derivative at $s = s_{\min}$, then the crossing location distribution is asymptotically normal:

$$p^{(\epsilon)}(s)\, ds \sim \exp\left( -\frac{W''(s_{\min})\, |s - s_{\min}|^2}{2\epsilon} \right) ds.$$

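Precise Small-Volatility Asymptotics

How to go beyond the small-volatility exponential falloff rate provided by the action function $W$? How to compute second partial derivatives $\partial^2 W / \partial x_i \partial x_j$ efficiently?

An analytic (non-probabilistic!) approach to the first: introduce a refined approximation

$$\rho^{(\epsilon)}(x) \sim K(x) \exp(-W(x)/\epsilon), \qquad \epsilon \to 0,$$

to the stationary solution $\rho^{(\epsilon)}$ of the forward diffusion equation (a parabolic PDE; repeated indices are summed):

$$\frac{\partial \rho}{\partial t} = \frac{\epsilon}{2} \frac{\partial^2}{\partial x_i \partial x_j} [D_{ij}(x)\, \rho] - \frac{\partial}{\partial x_i} [u_i(x)\, \rho].$$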

Substitute, and set the coefficients of $\epsilon^{-1}$ and $\epsilon^{0}$ to zero. Result: equations for both $W$ and (new!) the pre-exponential function $K$.

A Hamilton-Jacobi equation for $W$:

$$H\left(x, \frac{\partial W}{\partial x}\right) = 0.$$

Solving this equation yields the familiar dominant trajectories ("zero-energy optimal trajectories"), and the familiar numerical integration scheme.

An equation for the pre-exponential function $K$ (repeated indices are summed; the middle term vanishes when $D$ is constant):

$$\dot K = -\left[ \frac{\partial u_i}{\partial x_i} + \frac{\partial D_{ij}}{\partial x_i} \frac{\partial W}{\partial x_j} + \frac12 D_{ij} \frac{\partial^2 W}{\partial x_i \partial x_j} \right] K.$$

To yield $K(x')$, this must be integrated along the dominant trajectory from the attractor to $x'$.

A Matrix Riccati Equation

How to compute the Hessian matrix $W_{,ij} := \partial^2 W / \partial x_i \partial x_j$, along any dominant trajectory?

Answer: Manipulation of the H.-J. equation yields a matrix Riccati equation: $\dot W_{,ij}$ equals

$$-\frac{\partial^2 H}{\partial p_k \partial p_l} W_{,ki} W_{,lj} - \frac{\partial^2 H}{\partial x_i \partial p_k} W_{,kj} - \frac{\partial^2 H}{\partial x_j \partial p_k} W_{,ki} - \frac{\partial^2 H}{\partial x_i \partial x_j}$$

along any dominant trajectory (repeated indices are summed).

A triangular numerical scheme now follows:

- Compute a dominant trajectory $t \mapsto (x(t), p(t))$.
- Simultaneously, integrate the matrix Riccati equation along the dominant trajectory.
- Also simultaneously, integrate the equation for $K$ along the dominant trajectory.
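The triangular scheme can be sketched in one dimension, where the matrix Riccati equation reduces to a scalar one. As before this is an illustration under assumptions, not the talk's implementation: the drift $u(x) = x - x^3$, the constant $D = 1$, the step sizes and all names are hypothetical. With constant $D$ in 1-D, the Riccati equation reads $\dot W'' = -D\,(W'')^2 - 2u'\,W'' - u''\,p$ and the prefactor equation reads $\dot K = -(u' + \tfrac12 D\,W'')\,K$. For this gradient example the exact answers are $W'' = 6x^2 - 2$ and $K$ constant, which gives a check.

```python
D = 1.0  # constant diffusivity (assumption)

def u(x):
    """Hypothetical drift with an attractor at x = 1."""
    return x - x ** 3

def du(x):
    return 1.0 - 3.0 * x ** 2

def d2u(x):
    return -6.0 * x

def rhs(state):
    """Trajectory (x, p), action W, Hessian h = W'', prefactor K."""
    x, p, w, h, k = state
    xdot = D * p + u(x)                                 # dH/dp
    pdot = -du(x) * p                                   # -dH/dx
    wdot = p * xdot                                     # dW = p dx
    hdot = -D * h * h - 2.0 * du(x) * h - d2u(x) * p    # scalar Riccati equation
    kdot = -(du(x) + 0.5 * D * h) * k                   # prefactor equation
    return (xdot, pdot, wdot, hdot, kdot)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def triangular_scheme(x_end, delta=1e-4, dt=1e-3):
    """Integrate all five quantities outward from the attractor x = 1."""
    hess = -2.0 * du(1.0) / D   # W''(1) from the linearized problem
    state = (1.0 - delta, -hess * delta, 0.5 * hess * delta ** 2, hess, 1.0)
    while state[0] > x_end:
        state = rk4_step(rhs, state, dt)
    return state  # (x, p, W, W'', K), with K normalized to 1 at the attractor

if __name__ == "__main__":
    x, p, w, h, k = triangular_scheme(0.3)
    print("x:", x, " W'':", h, " exact W'':", 6 * x * x - 2, " K:", k)
```

The ordering is what makes the scheme triangular: the trajectory equations do not involve $W''$ or $K$, the Riccati equation uses only the trajectory, and the $K$ equation uses both.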

Potential Numerical Problems

- The coupled ordinary differential equations for $(x, p)$, for $W$, for the matrix $(\partial^2 W / \partial x_i \partial x_j)$, and $K$, are quite stiff. Solution: Use a good integration algorithm.
- Dominant trajectories extending from the attractor may cross, yielding multivaluedness of $W$ and $K$. (More than one trajectory to $x'$!) Interpretation: The least-action dominant trajectory is the relevant one (since action is an exponential falloff factor).
- The prefactor $K$, when integrated along outgoing dominant trajectories, may diverge. Fact: This only happens on irrelevant dominant trajectories, which may "bounce off" caustics.

Exotic Extensions

What if the drift field $u$ on $\mathbb{R}^n$ has two attractors? Then as $\epsilon \to 0$ (small-volatility limit), fluctuations from the vicinity of one to the other become exponentially rare.

- The frequency of fluctuations between attractors can be computed from the flux of probability, in a transient situation, over the separatrix between them. Result: exponential falloff as $\epsilon \to 0$, and the pre-exponential factor, are both computable. The limiting separatrix crossing location distribution is computable too. (See SIAM J. Appl. Math. paper.)

What if the drift field is periodically modulated, rather than static?

- Work on an expanded state space: $\mathbb{R}^n \times S^1$ (state plus modulation phase), rather than on $\mathbb{R}^n$. On this cylinder, dominant trajectories spiral out from a loop-like attractor. (See new preprint.)

Summary and Conclusions

1. This approach is useful in numerically approximating exponentially small quantities, such as:
   - the probability that a Markov diffusion process with small volatility will wander far from where its drift would take it, and
   - the frequency with which a recurrent Markov diffusion process will undergo a fluctuation of specified size, in a specified direction.
   Moreover, it yields approximations to boundary crossing location distributions.
2. It extends large deviation theory, by constructing (analytically!) small-volatility approximations to solutions of the diffusion equation.
3. This approach owes much to theoretical physics. (Hamilton's eqns., PDE approximations, etc.)
4. Extensions to jump processes are possible.