On Lyapunov Inequalities and Subsolutions for Efficient Importance Sampling


On Lyapunov Inequalities and Subsolutions for Efficient Importance Sampling

By Jose Blanchet, Kevin Leder and Peter Glynn
Columbia University, Columbia University and Stanford University
July 28, 2009

Abstract

In this paper we explain some connections between Lyapunov methods and subsolutions of an associated Isaacs equation for the design of efficient importance sampling schemes. As we shall see, these two approaches are closely related. Subsolutions, which arise by taking an appropriate limit of an associated Lyapunov inequality, have been proposed recently (see e.g. [12, 14]) and have been successfully applied to many challenging problems in rare-event simulation [15, 11]. Lyapunov inequalities have been used for testing the efficiency of state-dependent importance sampling schemes in heavy-tailed or discrete settings [5, 6, 4]. While subsolutions provide a powerful and versatile vehicle for constructing, through an analytic criterion, efficient samplers, Lyapunov inequalities are useful for obtaining more precise information on the behavior of the coefficient of variation of the associated importance sampling estimator. The use of Lyapunov inequalities also allows one to gain insight into the various mollification procedures that are often required in constructing associated subsolutions. For instance, for a class of first-passage time problems for multidimensional Markov random walks we show that a suitable mollification parameter yields a strongly efficient estimator, in the sense that the coefficient of variation remains bounded as the probability of interest decreases. Finally, we show that an appropriate mollification scheme in a sampler proposed in [11], for the overflow probability at level $n$ of the whole population of a Markovian tandem network within a busy period, has a squared coefficient of variation that grows at most at rate $n^{2d+2-\beta}$, where $\beta$ is the number of bottlenecks, as $n \to \infty$.

1 Introduction

In recent years, there has been a substantial amount of research on the efficient design of state-dependent importance sampling estimators. This research has been motivated by a series of examples and counterexamples (see [17] and [18]) related to the use of large deviations principles in the design of efficient importance sampling estimators for light-tailed systems, and also by the development of efficient importance sampling estimators for heavy-tailed systems, which, as explained in [3], must often be state-dependent (see also [2] for more on the challenges that arise in the context of efficient rare-event simulation for heavy-tailed systems). In order to systematically address the construction of importance sampling estimators that can be rigorously shown to be efficient in large deviations environments for light-tailed systems, P. Dupuis, H. Wang and their students have developed a method based on control theory and the use of subsolutions of an associated Isaacs equation for constructing and testing asymptotically optimal importance sampling estimators [12, 14, 11, 10]. A related approach, based on the construction of Lyapunov inequalities, has been used in [5, 7, 6] for the construction and analysis of state-dependent importance sampling estimators for heavy-tailed systems. The Lyapunov method has also been used for counting bipartite graphs with a given degree sequence in [4]. In this paper we discuss connections between subsolutions of associated Isaacs equations and Lyapunov inequalities. We focus on light-tailed systems because the Isaacs equation arises from a limiting procedure which makes sense only in light-tailed settings under an appropriate large deviations scaling. The Isaacs equation arises in the context of zero-sum games.
As pointed out in [12], such an equation comes up in the analysis of importance sampling estimators through the introduction of an artificial player arising from a variational representation obtained by working with the logarithm of the value function of an associated control problem. In Section 3 we derive the form of a basic Lyapunov inequality for the analysis of state-dependent importance sampling algorithms for estimating first-passage time probabilities. In Section 4 we derive subsolutions to an associated Isaacs equation by means of a suitable Lyapunov inequality. As we shall see, the subsolution of the Isaacs equation can be obtained as a limiting process starting from a suitable Lyapunov inequality by fixing the family of importance sampling distributions to be that of exponential

changes-of-measure. The limiting process, which connects the verification procedure for the efficiency of a given importance sampling estimator with the fluid scales at which the large deviations rates are obtained, is necessary for the derivation of the subsolutions. Since the limiting process requires the existence of certain derivatives, the associated subsolutions must satisfy certain smoothness properties, such as continuous differentiability. Often, when the associated dynamics are of random walk type, the subsolutions are obtained as the infimum of finitely many affine functions (see [12]) and therefore are not continuously differentiable. The approach suggested in [12] is to mollify the subsolution in order to obtain the required differentiability. The mollification parameters, as we shall see, appear to play a substantial role in the efficiency of the algorithm. As we shall see in Section 5, the direct application of Lyapunov inequalities, without going through the process of taking the logarithm to work with the rate function, often yields a verification approach that is free of a limiting procedure. As a consequence, no additional mollification parameter is required, which in particular implies that more information on the performance of the importance sampling estimator can be obtained. The construction of Lyapunov inequalities often requires some ingenuity. However, we believe it is often the case that subsolutions to the associated Isaacs equation can be used to construct an appropriate Lyapunov function which, as we indicated before, can provide more precise information about the performance of the algorithm. In the second part of Section 5 we illustrate how, by avoiding the use of a mollification parameter and working directly with Lyapunov inequalities, one can construct strongly efficient estimators for a large class of multidimensional first-passage time problems for Markov random walks.
Finally, in order to illustrate how the direct use of Lyapunov inequalities can provide useful information on the performance of state-dependent importance sampling estimators for light-tailed systems, in Section 6 we consider the class of Markovian tandem networks. We revisit the estimators proposed in [11]. Our analysis provides a refinement of the performance analysis of these estimators and guides the selection of the various mollification parameters involved in the construction of their sampler. Our analysis in Section 6 allows us to conclude that, in common circumstances, a suitably mollified version of the estimator from [11] has a computational complexity that is polynomial, with a rate that is no worse than that of solving the associated linear system of equations corresponding to the exact solution of the overflow probability in question. In contrast, the analysis presented in [11] concludes only subexponential complexity for their proposed family of importance sampling schemes.

The relationship that we see between Lyapunov inequalities and subsolutions of an associated Isaacs equation is a symbiotic one; each approach benefits from the other. The technique of Lyapunov inequalities is a very general and powerful approach for establishing stability of dynamical systems, both stochastic and deterministic. If one can find the appropriate Lyapunov function, then the analysis of importance sampling estimators is easily carried out with a Lyapunov inequality. Sometimes, as in the heavy-tailed settings illustrated in [7, 6], one can obtain a reasonable guess of the form of the Lyapunov function by using fluid heuristics. In the light-tailed case, however, it can be quite difficult to find the appropriate function. Fortunately, the work of Dupuis et al. on subsolutions to an associated Isaacs equation can help resolve this issue in light-tailed systems, because the integral equations that arise in the corresponding Lyapunov inequalities are replaced by differential inequalities, which are often easier to solve using large deviations fluid analysis. In particular, in the several examples we present in this work, a suitable Lyapunov function is constructed through the subsolution of an associated Isaacs equation. In this work we also observe that Lyapunov inequalities can improve the subsolution approach: as mentioned earlier, the refined bounds achieved through the Lyapunov method allowed us to make better choices of the parameters used in the subsolution approach. To summarize: in the light-tailed setting, a family of importance sampling changes of measure can be constructed using the subsolution to an associated Isaacs equation; one can then select the optimal change of measure from the family, and analyze its performance, using a Lyapunov inequality.
The rest of the paper is organized as follows...

2 Basic Notions on Efficiency of Rare-event Simulation Estimators

Here we describe strong efficiency, weak efficiency and other notions of complexity. Write $\alpha_n = P(A_n)$ and assume that $\alpha_n \to 0$ as $n \to \infty$. We concentrate on unbiased estimators. An estimator $R_n$ is said to be strongly efficient if $E R_n = \alpha_n$ and $E R_n^2 = O(\alpha_n^2)$ as $n \to \infty$. The estimator is weakly efficient, or asymptotically optimal, if for each $\varepsilon > 0$ we have that $E R_n^2 = O(\alpha_n^{2-\varepsilon})$ as $n \to \infty$. Finally, we say that the estimator has polynomial complexity of order at most $l$ if $E R_n^2 = O(\alpha_n^2 \log(1/\alpha_n)^{2l})$; in other words, if its coefficient of variation grows at most at a polynomial rate of degree $l$ in $\log(1/\alpha_n)$. In most cases, the analysis of importance sampling estimators (especially state-dependent samplers in light-tailed settings) concludes only weak efficiency of the estimators. As we shall see in later sections, the use of Lyapunov inequalities often allows one to lift such weak efficiency results to more precise statements involving polynomial complexity.

Suppose that an estimator $R_n$ has polynomial complexity of order at most $l$ and consider $Z_{n,m} = \sum_{i=1}^{m} R_n^{(i)}/m$, where the $R_n^{(i)}$'s are independent replications of $R_n$. It follows from Chebyshev's inequality that at most $m = O(\varepsilon^{-2} \delta^{-1} \log(1/\alpha_n)^{2l})$ replications are required to conclude that $Z_{n,m}$ is $\varepsilon$-close to $\alpha_n$ in relative terms with at least $1 - \delta$ confidence.

3 State-dependent Importance Sampling and Lyapunov Inequalities

Let $W = (W_n : n \geq 0)$ be a Markov chain taking values in some state space $S$. Given disjoint sets $A, B \subseteq S$, define $T = \inf\{n \geq 0 : W_n \in A \cup B\}$ and let $f$ be a non-negative measurable function. Suppose we are interested in developing bounds for the variance of a given state-dependent importance sampling estimator for the expectation
$$u(w) = E_w[f(W_T)\, I(T < \infty)],$$
where $E_w$ denotes the expectation operator on path space associated to the Markov process $W$ given that $W_0 = w$.
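The replication count implied by Chebyshev's inequality in Section 2 can be sketched numerically. The following helper is an illustrative sketch (the constant hidden in the $O(\cdot)$ bound, here `cv2_bound` and `c`, is an assumption supplied by the user, not something derived in the paper):

```python
import math

def replications_needed(cv2_bound, eps, delta):
    """Chebyshev bound: with m i.i.d. replications of an unbiased estimator
    whose squared coefficient of variation is at most cv2_bound, the sample
    mean is within relative error eps of the target with probability at
    least 1 - delta whenever m >= cv2_bound / (eps**2 * delta)."""
    return math.ceil(cv2_bound / (eps ** 2 * delta))

def replications_polynomial(alpha_n, l, eps, delta, c=1.0):
    """Replication count for an estimator of polynomial complexity of order
    at most l: cv^2 <= c * log(1/alpha_n)^(2l)."""
    cv2 = c * math.log(1.0 / alpha_n) ** (2 * l)
    return replications_needed(cv2, eps, delta)
```

For example, an estimator with squared coefficient of variation bounded by 1 needs only $\lceil 1/(\varepsilon^2 \delta) \rceil$ replications regardless of how small $\alpha_n$ is, which is the practical content of strong efficiency.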

Let us write $K(x, dy)$ to denote the transition kernel associated to $W$; in other words, $P_w(W_1 \in A) = \int_A K(w, dy)$. A state-dependent importance sampling change-of-measure is specified by a Markov transition kernel $\tilde{K}(x, dy)$ defined via
$$\tilde{K}(x, dy) = \frac{K(x, dy)}{r(x, y)}$$
for some positive function $r$ such that $\int r(x, y)^{-1} K(x, dy) = 1$ for each $x \in S$. We use $\tilde{P}_w$ to denote the path-measure induced by the kernel $\tilde{K}$ given the initial condition $w$. A state-dependent importance sampling estimator for $u(w)$ takes the form
$$Y = \prod_{k=0}^{T-1} r(\tilde{W}_k, \tilde{W}_{k+1})\, f(\tilde{W}_T)\, I(T < \infty),$$
where the notation $\tilde{W} = (\tilde{W}_k : k \geq 0)$ has been introduced to emphasize that the process follows the law induced by the Markov kernel $\tilde{K}$, assuming $\tilde{W}_0 = w$. It is clear that $u(w) = \tilde{E}_w Y$, so $Y$ is unbiased. The performance of the estimator $Y$ can then be measured in terms of its mean squared error, which in turn is bounded by its second moment, namely,
$$s_r(w) = \tilde{E}_w Y^2 = \tilde{E}_w\Big[ \prod_{k=0}^{T-1} r(\tilde{W}_k, \tilde{W}_{k+1})^2 f(\tilde{W}_T)^2 I(T < \infty) \Big] = E_w\Big[ \prod_{k=0}^{T-1} r(W_k, W_{k+1})\, f(W_T)^2\, I(T < \infty) \Big].$$
The following inequality allows us to find a bound for $s_r$.

Lemma 1. Suppose that there exist a non-negative function $v$, a constant $\rho > 0$ and $\delta \geq 0$ such that
$$v(w)\, e^{\delta} \geq E_w[\, r(w, W_1)\, v(W_1)\,] \quad \text{for } w \notin A \cup B,$$
and $v(W_T)\, I(T < \infty) \geq \rho\, f(W_T)^2\, I(T < \infty)$. Then,
$$\frac{v(w)}{\rho} \geq \tilde{E}_w\Big[ e^{-\delta T} \prod_{k=0}^{T-1} r(\tilde{W}_k, \tilde{W}_{k+1})^2 f(\tilde{W}_T)^2\, I(T < \infty) \Big].$$

In addition, if $T \leq m/\delta$ for some $m \in (0, \infty)$, then $v(w)\, e^{m}/\rho \geq s_r(w)$.

Proof. The proof proceeds along the same lines as [5]; we provide the details in order to make the discussion self-contained. First, define
$$M_n = v(W_{T \wedge n}) \prod_{k=0}^{(T \wedge n) - 1} e^{-\delta}\, r(W_k, W_{k+1}).$$
We note that $M_n$ is a non-negative supermartingale adapted to the filtration $(\mathcal{F}_n : n \geq 0)$ generated by $W$. To see this, observe that
$$E_w[M_{n+1} \mid \mathcal{F}_n]\, I(T > n) = \prod_{k=0}^{n-1} e^{-\delta} r(W_k, W_{k+1})\; E_w\big[ v(W_{n+1})\, e^{-\delta} r(W_n, W_{n+1}) \,\big|\, \mathcal{F}_n \big]\, I(T > n) \leq v(W_n) \prod_{k=0}^{n-1} e^{-\delta} r(W_k, W_{k+1})\, I(T > n) = M_n\, I(T > n).$$
Therefore,
$$E_w[M_{n+1} \mid \mathcal{F}_n] = E_w[M_{n+1} \mid \mathcal{F}_n]\, I(T \leq n) + E_w[M_{n+1} \mid \mathcal{F}_n]\, I(T > n) \leq M_n\, I(T \leq n) + M_n\, I(T > n) = M_n.$$
We conclude that
$$v(w) = M_0 \geq E_w M_n \geq E_w\Big[ v(W_T) \prod_{k=0}^{T-1} e^{-\delta} r(W_k, W_{k+1})\, I(T \leq n) \Big] \geq E_w\Big[ \rho\, f(W_T)^2 \exp(-\delta T) \prod_{k=0}^{T-1} r(W_k, W_{k+1})\, I(T \leq n) \Big].$$
Sending $n \to \infty$ and using monotone convergence we obtain that
$$v(w)/\rho \geq E_w\Big[ f(W_T)^2 \exp(-\delta T) \prod_{k=0}^{T-1} r(W_k, W_{k+1})\, I(T < \infty) \Big].$$
The rest of the lemma follows easily from the previous expression.
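A minimal concrete instance of the estimator bounded by Lemma 1 is the i.i.d. Gaussian random walk with negative drift, tilted so that the per-step likelihood ratio $r$ telescopes (a state-independent special case of the framework above; the function names and parameter values below are illustrative assumptions, not the paper's notation):

```python
import math
import random

def is_first_passage(mu, sigma, a, n_rep, seed=0):
    """Importance-sampling estimate of P(a Gaussian random walk with drift
    mu < 0 ever reaches level a > 0). Uses the exponential tilt
    theta = -2*mu/sigma**2, for which psi(theta) = 0, so the tilted walk has
    drift -mu > 0 (the level is hit a.s.) and the per-step likelihood ratio
    r(W_k, W_{k+1}) reduces to exp(-theta * increment)."""
    rng = random.Random(seed)
    theta = -2.0 * mu / sigma ** 2
    total = 0.0
    for _ in range(n_rep):
        w, lr = 0.0, 1.0
        while w < a:
            dx = rng.gauss(mu + sigma ** 2 * theta, sigma)  # tilted increment
            w += dx
            lr *= math.exp(-theta * dx)  # accumulate the product from Lemma 1
        total += lr  # here f(W_T) = 1 on the target set A = [a, infinity)
    return total / n_rep
```

Note the pathwise bound: each replication equals $\exp(-\theta W_T) \leq \exp(-\theta a)$ since $W_T \geq a$, which is exactly the kind of uniform second-moment control that a Lyapunov function certifies.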

Remark 1. Note that we have introduced a parameter $\delta \geq 0$ in the statement of the previous Lyapunov inequality. Ideally, we would like to construct $v$ based on the selection $\delta = 0$; however, as we shall see below, it is often the case that introducing the parameter $\delta > 0$ facilitates the construction of such a Lyapunov function $v$. One then has to deduce the inequality for $s_r$ using the problem structure, for instance that $T \leq m/\delta$; other examples are discussed in later sections.

Remark 2. It is important to note that we can select $r(w_0, w_1)^{-1} = u(w_1)/u(w_0)$, in which case it follows that $v(w) = u(w)^2$ satisfies the Lyapunov inequality from the previous lemma with strict equality (selecting $\delta = 0$ and $\rho = 1$), and we have that
$$s_r(w) = E_w\Big[ \frac{u(w)}{u(W_T)}\, f(W_T)^2\, I(T < \infty) \Big] = u(w)\, E_w[f(W_T)\, I(T < \infty)] = u(w)^2.$$
As a consequence, selecting $r(w_0, w_1)^{-1} = u(w_1)/u(w_0)$ describes the zero-variance change-of-measure.

4 Isaacs Equation, Subsolutions and Lyapunov Inequalities

We shall consider the setting of state-dependent random walks. Consider a family of systems $Y = (Y_t : t \in \{0, \varepsilon, 2\varepsilon, \ldots\})$, indexed by the parameter $\varepsilon > 0$ and taking values in a subset $S$ of $\mathbb{R}^d$, such that
$$Y_{t+\varepsilon} = Y_t + \varepsilon V_{t+\varepsilon}(Y_t),$$
where the $V_t(y)$'s are i.i.d. (independent and identically distributed) random variables whose distribution depends on the current position $y$. We have suppressed the explicit dependence of the $Y_t$'s on the parameter $\varepsilon$ in order to simplify the notation. The constant $\varepsilon > 0$ is a scaling parameter which we shall send to zero in order to obtain corresponding large deviations estimates. Define the log-moment generating function
$$\psi(\theta, y) = \log E \exp(\theta^T V_t(y))$$
and assume that for each $y \in S$, $\psi(\theta, y) < \infty$ for all $\theta \in \mathbb{R}^d$ (this condition can be relaxed under special structure, for instance if the $V_t(y)$'s do not depend on $y$). Settings in which the log-moment generating function is assumed to be finite, as in our case, are said to be light-tailed. Under suitable regularity conditions, for instance if $\mu(y) = E V_0(y)$ is a Lipschitz function with at most linear growth (see e.g. [9]), it turns out that as $\varepsilon \to 0$ we have that $Y \to y$ in probability over compact time intervals, where $y = (y(t) : t \geq 0)$ satisfies
$$\frac{d y(t)}{dt} = E V_0(y(t)), \qquad y(0) = y_0.$$
The large deviations behavior as $\varepsilon \to 0$ (i.e. probabilities associated to sample paths that deviate from $y$ as $\varepsilon \to 0$) is dictated by the so-called rate function of the system, which is computed via
$$J(w) = \int_0^\infty I(w(s), \dot{w}(s))\, ds,$$
where
$$I(w(s), \dot{w}(s)) = \max_{\theta}\, \big[ \theta^T \dot{w}(s) - \psi(\theta, w(s)) \big].$$
In order to illustrate the use of large deviations analysis and its connections to importance sampling estimators and Lyapunov inequalities, let us concentrate on a canonical example. Consider the problem of estimating
$$u_\varepsilon(y) = P_y(T_A < T_B,\; T_{A \cup B} < \infty) = P_y(Y_T \in A,\; T < \infty), \qquad (1)$$
where, for any set $C$, $T_C = \inf\{t \geq 0 : Y_t \in C\}$. We put $T = T_{A \cup B}$ and note that the explicit dependence on $\varepsilon$ of the times $T$, $T_A$ and $T_B$ has been suppressed in order to simplify the notation. Assume that both $A$ and $B$ are open sets for which $(y(t) : t \geq 0) \cap A = \emptyset$ and $(y(t) : t \geq 0) \cap B \neq \emptyset$. We then have, under suitable assumptions (see e.g. [16]), that as $\varepsilon \to 0$,
$$\varepsilon \log u_\varepsilon(y) \to -h(y), \qquad (2)$$
where
$$h(y) = \inf_{w \in \mathcal{C}} J(w),$$
and the infimum is taken over the set $\mathcal{C}$ of absolutely continuous functions such that $w(0) = y$, $w(t) \in A$ for some $t < \infty$ and $w(s) \notin A \cup B$ for $s < t$.
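The local rate function $I(\cdot, \cdot)$ just defined is the Legendre transform of $\psi(\cdot, y)$. In one dimension it can be computed numerically by solving $\partial_\theta \psi(\theta) = v$, since $\psi$ is convex in $\theta$. The sketch below uses a state-independent Gaussian toy model (an assumed example, for which the closed form is $I(v) = (v - \mu)^2 / (2 \sigma^2)$) as a sanity check:

```python
def rate_function(v, psi, dpsi, lo=-50.0, hi=50.0, tol=1e-10):
    """I(v) = sup_theta (theta*v - psi(theta)). Since psi is convex, the
    maximizing theta solves dpsi(theta) = v; find it by bisection on [lo, hi]
    (the root is assumed to lie in this bracket)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dpsi(mid) < v:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    th = 0.5 * (lo + hi)
    return th * v - psi(th)

# Gaussian increments with mean mu and variance s2 (illustrative values):
# psi(theta) = mu*theta + s2*theta^2/2, so I(v) = (v - mu)^2 / (2*s2).
mu, s2 = -1.0, 2.0
psi = lambda th: mu * th + 0.5 * s2 * th * th
dpsi = lambda th: mu + s2 * th
```

Because the maximizer is a stationary point, small bisection error in $\theta$ induces only second-order error in $I(v)$, so this is accurate even with a crude bracket.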

10 Our goal is to find a family of state-dependent importance sampling estimators to calculate u x. Note that the problem of estimating u y fits the framework described in the previous section, to see this, just let f w = I w A. A family of importance sampling distributions that is often used in the context of light-tailed problems i.e. in the case in which ψ θ, y < as in our case is given by exponential changes-of-measure. In other words, given that Y t = y, the increment V t+ y is sampled according to the distribution P θy V t+ y v + dv = exp θ y T v ψ θ y, y P V t+ y v + dv. Note that the parameter θ y is allowed to depend on the current state y of the process. A larger class of importance sampling distributions, which are quite useful, as we shall indicate below, is given by state-dependent mixtures of exponential changes-of-measure. In other words, P V t+ y v + dv P V t+ y v + dv = l j=1 r j y exp θ j y T v ψ θ j y, y, where r j y 0 and l j=1 r j y = 1. For the moment, we shall concentrate on the use of standard exponential changes-of-measure but we shall return to the use of mixtures later. As we indicated in Remark 2 following Lemma 1, the zero-variance changeof-measure is selected by choosing r y 0, y 1 1 = u y 1 /u y 0. We shall derive the associated Isaacs equation, which dictates the optimal exponential change-of-measure as 0, as a natural approximation to the zero-variance change-of-measure. First, write υ x = 1 log u x. We shall use K x, y to denote the zero-variance change-of-measure, that is K x, y = P x, y u y u x, which is a well defined transition kernel because u x = Eu x + V x. Or, in terms of υ, we obtain 1 = E exp 1 υ x υ x + V x. 10

Assuming that $\upsilon_\varepsilon(x + \varepsilon v) = \upsilon_\varepsilon(x) + \varepsilon \nabla_x \upsilon_\varepsilon(x)^T v + o(\varepsilon)$, and that $\upsilon_\varepsilon(x)$ and $\nabla_x \upsilon_\varepsilon(x)$ converge in an appropriate sense to $\upsilon(x)$ and $\nabla_x \upsilon(x)$, respectively, we obtain that
$$1 = E \exp\big( -\nabla_x \upsilon(x)^T V(x) + o(1) \big),$$
which yields
$$\psi(-\nabla_x \upsilon(x), x) = 0. \qquad (3)$$
Since $\upsilon_\varepsilon(x) = \infty$ on $B$ and $\upsilon_\varepsilon(x) = 0$ on $A$, we also must have that $\upsilon(x) = \infty$ for $x \in B$ and $\upsilon(x) = 0$ on $A$. It turns out, as we shall see, that only the boundary condition $\upsilon(x) = 0$ on $A$ plays a relevant role. For the moment, however, let us continue our discussion including both boundary conditions. Equation (3), together with the associated boundary conditions, is a so-called Isaacs equation and, not surprisingly given the large deviations result (2), is satisfied in a weak sense by setting $\upsilon(x) = h(x)$.

Now, let us connect the solutions of the Isaacs equation with its corresponding subsolutions via Lyapunov inequalities. Suppose we have a function $(g(x) : x \in S)$ that is continuously differentiable. We wish to apply importance sampling according to exponential tiltings given by the gradient of $g$. In view of Lemma 1, we need to find a Lyapunov function $v(x)$ such that
$$E_x\big[ v(x + \varepsilon V(x)) \exp\big( \nabla_x g(x)^T V(x) + \psi(-\nabla_x g(x), x) \big) \big] \leq v(x).$$
We select $v(x) = \exp(-\varepsilon^{-1} g(x))$ and obtain that
$$E_x \exp\big( -\varepsilon^{-1} [g(x + \varepsilon V(x)) - g(x)] + \nabla_x g(x)^T V(x) + \psi(-\nabla_x g(x), x) \big) \leq 1.$$
If $g$ is continuously differentiable, then we arrive at
$$-\varepsilon^{-1} [g(x + \varepsilon V(x)) - g(x)] + \nabla_x g(x)^T V(x) + \psi(-\nabla_x g(x), x) = \psi(-\nabla_x g(x), x) + o(1). \qquad (4)$$
So, we require that $\psi(-\nabla_x g(x), x) \leq 0$, which is the subsolution property described in [14]. Following Lemma 1, the corresponding boundary condition is $v(x) \geq I(x \in A)$ for $x \in A \cup B$, which is equivalent to $v(x) \geq \rho$ for $x \in A$, that is, $g(x) \leq -\varepsilon \log \rho$ for $x \in A$.

12 Neglecting the contribution of order o 1 is handled using a relaxed Lyapunov inequality introducing a parameter δ 0 as in Lemma 1. At the end one needs to deal with the fact that the strict Lyapunov inequality is not satisfied but it may be violated by an amount of order o 1 as 0. Often, as we shall see in future sections, this can be done if we ensure that o 1 is actually o or even O as 0. It is important to note that the derivation above requires g to be continuously differentiable, otherwise we cannot guarantee the estimate o 1 above. Sometimes it is difficult or might be even impossible to find a continuously differentiable subsolution to the Isaacs equation. However, we might consider a smooth sequence of functions that violate the Lyapunov bound by an amount of order o as indicated above. These types of situations arise often in the context of queueing networks, as we shall discuss in future sections. 5 Markov Random Walks For simplicity we shall work with an irreducible and aperiodic finite statespace Markov chain Z = Z j : j 0 taking values on the space S. The discussion that follows can be easily extended to more general Markov chains under appropriate geometric ergodicity assumptions. However, our objective here is to illustrate how to construct associated Lyapunov functions in order to test efficiency therefore, for pedagogical reasons, we prefer to keep the technical conditions at a minimum. We introduce a sequence ξ j : j 1 of i.i.d. random variables and put X j = H ξ j, Z j for a given measurable, function H taking values on R d. Let us define φ θ, w = E exp θ T X j Zj = w and assume that there exists δ > 0 such that for all w S, φ θ, z < for θ δ. Moreover, let p z 0, z 1 : z 0, z 1 S be the probability transition matrix associated to Z and define Q θ z 0, z 1 = p z 0, z 1 φ θ, z 1. 
The Perron-Frobenius theorem see [1] indicates that there exists left and right strictly positive eigenvectors, π θ and u θ respectively, associated to a positive eigenvalue β θ = exp ψ θ such that π θ Q θ = β θ π θ, Q θ u θ = β θ u θ, 12

13 and for each z S 1 n log Qn θ 1 z = 1 n log E z exp θ T S n ψ θ as n. We shall assume that ψ 0 = 0 and that ψ is steep in the sense that for each y R d there exists θ y such that ψ θ y = y and we will put I y = θ T y y ψ θ y. Note that when X j s are i.i.d. one can simply take u θ = 1 and ψ θ = log E exp θ T X A Classical Time In-homogeneous Example We first consider a classical problem in rare-event simulation, namely, evaluating u n z 0 = P z0 S n /n A 1... A m, where A 1, A 2,..., A m are closed convex sets that do not contain the origin. We put I Ai = inf y Ai I y and set I = min 1 i m I Ai. The i.i.d. case was first considered in [21]. The following result is known see e.g. [19] Theorem 1 In addition to the steepness assumption on ψ indicated above, suppose that there exists z such that H ξ j, z is a continuous random variable. Then, there exists c z 0, c + z 0, n 0 > 0 such that for all n n 0 c z 0 /n d/2 u n z 0 exp ni c + z 0 /n 1/2. Our goal is to design a rare-event simulation algorithm whose coefficient of variation can be shown to grow at a polynomial rate in n. These types of problems have been considered by [13, 8]. The idea is first to select a family of changes-of-measure based on mixtures. In particular, given Z k 1 = z k 1, the next increment X k = H ξ k, Z k is simulated according to a finite mixture of transition laws of the form M θ z k 1, z k, x + dx = u θ, z k u θ, z k 1 exp θx ψ θ P X k x + dx Z k = z k p z k 1, z k, 13

14 where uθ, z k is the z k entry of the eigenvector u θ. Note that such probabilistic law is properly defined as long as ψ θ < in the sense that u θ, zk p z k 1, z k exp θ T x ψ θ P X k x + dx Z k = z k z k S = z k S = u θ, z k 1 u θ, z k p z k 1, z k exp ψ θ φ θ, z k u θ, z k 1 exp ψ θ u θ, z k 1 Q θ z k 1, z k u θ, z k = 1. z k S More precisely, at time k 1, given Z k 1 = z k 1 and S k 1 = s k 1, we propose to sample the pair Z k, X k according to the transition law P z k 1, s k 1, z k, x + dx = m j=1 r j k z k 1, s k 1 M θj z k 1, z k, x + dx, where r j > 0 and the θ j s are appropriately selected in order to satisfy an appropriate Lyapunov inequality. Note that we postulate the use of m functions r j, 1 j m. Each of these functions and the θ j s will correspond to each of the A j s. In particular, because of convexity, there exists a unique y j such that I Aj = I y j, we then put θ j = θ yj. The selection of r j s proceeds as follows. The corresponding Lyapunov inequality can be obtained by means of Lemma 1 by defining W k = k, Z k, S k /n,, A = {n} S A 1... A m and B = {n} S A 1... A m c. Applying Lemma 1 yields that we must find a non-negative function v k z, s such that E z v k+1 Z 1, s + X 1 /n v k z, s 1 m j=1 rj k z, s u θ j, Z 1 exp θj T X 1 ψ θ j /u θ j, z 1 subject to h n z, s ρ > 0 for each z S and s A 1... A m. We first introduce non-negative functions w j k z, s for 1 j m to be specified later and put r j k z, s = w j k z, s m i=1 wi k 14 z, s.

15 m 2 We propose v k z, s = j=1 wj k z, s so that E z v k+1 Z 1, s + X 1 v k z, s = E z 1 m j=1 wj k z, s Then, select w j k z, s such that 1 m j=1 rj k z, s u θ j, Z 1 exp θj T X 1 ψ θ /u θ j, z 5 m 2 j=1 wj k+1 Z 1, s + X 1 m j=1 wj k z, s u θ j, Z 1 exp θj T X 1 ψ θ j /u θ j, z. w j k+1 Z 1, s + X 1 /n = w j k z, s u θ j, Z 1 u θ j, z exp θ T j X 1 ψ θ j. 6 In particular, note that w j k z, s = exp nθ T j s kψ θ j + a j n u θ j, z 7 satisfies 6 for any a j n R. The constants a j n : 1 j m are selected depending on n in order to both, satisfy the boundary conditions and minimize the bound on the second moment implied by the Lyapunov inequality i.e. the value of v 0 z, 0. Note that selecting w j k as indicated in 7 yields that 5 equals m j=1 E z w j k+1 Z 1, s + X 1 m j=1 wj k z, s = = m j=1 m j=1 r j k z, s E z u θ j, Z 1 u θ j, z exp θ T j X 1 ψ θ j r j k z, s = 1 and therefore the Lyapunov inequality is satisfied assuming that the boundary condition holds given an appropriate selection for the a j n s. In order to appropriately select the a j n s we note that, by definition of y j, 0 θ T j y j ψ θ j θ T j s ψ θ j for all s A j. Therefore, selecting a j n = n θ T j y j ψ θ j, yields w j k z, s = exp nθ T j s kψ θ j n θ T j y j ψ θ j u θ j, z = exp nθ T j s y j + n k ψ θ j u θ j, z. 15

16 with w n j z, s u θ j, z for each s A j. This implies that if s A 1... A m, then v n z, s min u θ j, z 2 = ρ. 1 j m,z S In particular, Lemma 1 yields that the second moment of the corresponding importance sampling estimator, namely, R = satisfies n I S n /n A 1... A m m j=1 rj k Z k, S k u θ j, Z k exp θj T X k ψ θ j /u θ j, Z k 1 k=1 Ẽ z R 2 v 0 z, 0 ρ = m j=1 exp ni Aj u θj, z Theorem 1 combined with the previous analysis then yields the following result. Corollary 2 Under the assumptions of Theorem 1, R 2 has polynomial complexity. In particular, Ẽ z R 2 u n z 2 = O n d as d. Remark 3 The change of measure given by P, 8 of course corresponds to the change of measure suggested by the subsolution of Dupuis and Wang. When using the subsolution approach a mollification parameter, ɛ, must be chosen in order to create a smooth subsolution. [11] require choosing a sequence of mollification parameters ɛ that satisfy ɛ / as 0. The use of the Lyapunov inequality suggests that the selection of ɛ obeys to the fact that the verification procedure is performed at the level of the rate i.e. after taking log. The Lyapunov inequality also allows us to see that, in fact, we can choose ɛ =. 5.2 A Classical Time Homogeneous Example Now suppose that ψ 0 = µ 0 and let λ 1,..., λ m be linearly independent vectors in R d with unit norm and c i 0, for 0 i m. Assume also ρ 2. 16

17 that λ T i µ < 0 for all 1 i m. Let b > 0, define T i b = inf{n : λ T i S n c i b} and put T b = min 1 i m T i b. Consider the problem of estimating u b z = P z T b < efficiently as b. The previous formulation is a natural generalization of the problem of estimating the tail of the maximum of a Markov random walk with negative drift whose i.i.d. counterpart was first studied by Siegmund in his classical 1976 paper, [22]. Define X i k = λ T i X k /c i for all k 1 and 1 i m and set ψ i η = ψ ηλ T i /c i. Using this notation we are ready to state the following Theorem which follows easily from results in [1]. Theorem 3 In addition to steepness of ψ assume that there exists z such that H ξ j, z is a continuous random variable. Then, there exist constants c z, c + z 0, such that c z m exp η i b u b z c + z j=1 m exp η i b. as b, where η i is the only positive solution to the root ψ i η i = 0 for 1 i m. j=1 The description of the importance sampling strategy is similar to that given in the previous subsection. We first define θ i = η i λ T i /c i for 1 i m. Then, given Z k 1 = z k, S k 1 = s k 1 and that T b k sample the pair Z k, X k according to the transition law P z k 1, s k 1, z k, x + dx = m r j z k 1, s k 1 M θj z k 1, z k, x + dx, 8 j=1 where M θj z k 1, z k, x + dx = u θ j, z k u θ j, z k 1 exp θ jx P X k x + dx Z k = z k p z k 1, z k. 17

18 Once again, as in the previous subsection, we set r j z, s = w j z, s m j=1 wj z, s, for appropriate functions w j. According to Lemma 1 letting W j = Z j, S j, T A = T b and B = { }, we obtain that we must find a non-negative function v z, s such that E z v Z 1, s + X 1 v z, s 1 m j=1 rj z, s u θ j, Z 1 exp θj T X /u θj, z subject to the boundary condition v z, s ρ > 0 if λ T i s c i b for some i {1,...m} and each z S. We postulate m 2 v z, s = w j z, s. j=1 Substituting v z, s and the r j in terms of the w j s we obtain that v Z 1, s + X 1 1 E z v z, s m j=1 rj z, s u θ j, Z 1 exp θj T X 1 /u θj, z m j=1 = E wj Z 1, s + X 1 m j=1 z wj Z 1, s + X 1 m j=1 wj z, s m j=1 wj z, s u θ j, Z 1 exp θj T X. 1 /u θj, z Selecting for any a j b R yields w j z, s = exp θ T j s + a j b u θ j, z m j=1 wj Z 1, s + X 1 m j=1 wj z, s u θ j, Z 1 exp θ T j X 1 /u θj, z = 1 18

19 and therefore, E z m j=1 wj Z 1, s + X 1 m j=1 wj z, s = = m j=1 m j=1 w j z, s exp θj T X 1 u θj, Z 1 E z m j=1 wj z, s u θ j, z exp θ T r j j X 1 u θj, Z 1 z, s E z = 1, u θ j, z which implies that the Lyapunov inequality 9 is satisfied assuming that the a j b s are selected so that the boundary condition holds and the value of h z, 0 is minimized recall that we wish to obtain an upper bound. In order to select the a j b s note that w j z, 0 = exp a j b u θ j, z, a natural candidate consists in selecting a j b = η j b. In turn, note that θ T j s η j b = η j λ T j s c j b /c j. Therefore, we conclude that if λ T j s c j b for some 1 j m, then we must have that h z, s min u θ j, z 2 = ρ 1 j m,z S and therefore defining R = T 1 m j=1 rj Z k 1, S k 1 u θ j, Z k exp θj T X k /u θj, Z k 1, k=1 m 2 we obtain ẼzR 2 j=1 exp η jb u θ j, z /ρ, which implies, together with Theorem 3 the following result. Corollary 4 Under the assumptions of Theorem 3 the estimator R is strongly efficient in the sense that Ẽ z R 2 u b z = O 1 as b. 19
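The tilting parameters $\eta_i$ of Theorem 3 are positive roots of convex functions $\psi_i$ with $\psi_i(0) = 0$ and $\psi_i'(0) < 0$, so each can be located by bisection. The sketch below uses the i.i.d. Gaussian case as an assumed example, where $\psi(\eta) = \eta\mu + \eta^2\sigma^2/2$ and the root is $-2\mu/\sigma^2$ in closed form:

```python
def lundberg_root(psi, hi=100.0):
    """Positive root eta of psi(eta) = 0, for convex psi with psi(0) = 0 and
    negative slope at 0 (so psi < 0 just right of 0). Assumes psi(hi) > 0 so
    that (0, hi] brackets the root; bisection then converges monotonically."""
    lo = 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# i.i.d. Gaussian increments (illustrative values): root is -2*mu/s2 = 0.5.
mu, s2 = -0.25, 1.0
psi = lambda e: mu * e + 0.5 * s2 * e * e
```

In the Markov-modulated case one would replace the scalar $\psi$ by the log of the Perron-Frobenius eigenvalue of $Q_\theta$, but the root-finding step is the same one-dimensional problem along each direction $\lambda_i/c_i$.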

Remark 4. Similar to the finite-horizon problem, we use the change of measure suggested by the subsolution approach. However, the Lyapunov inequality is an extra tool that we can use to gain more refined information about the asymptotics of the second moment. In addition, it provides insight into the selection of the mollification parameter, as in the finite-horizon problem. Note that here it follows, because of strong efficiency, that taking the mollification parameter equal to the scaling parameter is actually optimal.

6 Tandem Networks

We consider a tandem network with $d$ stations, indexed from 1 to $d$ in such a way that the $i$-th station's output feeds the $(i+1)$-th station. The arrival rate to the network is $\mu_0 > 0$ and the service rate at station $i$ is $\mu_i > 0$. The process is assumed to be stable, so $\rho_i = \mu_0/\mu_i \in (0, 1)$. Without loss of generality we assume that $\mu_0 + \mu_1 + \cdots + \mu_d = 1$. Put $\mu_* = \min_{1 \leq i \leq d} \mu_i$, $\rho_* = \mu_0/\mu_*$, $\gamma = \log \rho_*$, and let $\beta$ be the number of bottleneck stations, that is, $\beta = \#\{1 \leq i \leq d : \rho_i = \rho_*\}$. We are interested in estimating overflow probabilities at level $1/\varepsilon$, as $\varepsilon \to 0$, for the whole population of the network within a busy period. In order to describe the overflow probability of interest in a suitable scale depending on $\varepsilon > 0$, let us introduce an appropriate family of processes corresponding to the embedded discrete-time Markov chain appropriately scaled by the factor $\varepsilon$. Following the notation introduced in Section 4, let $Y = (Y_t : t \in \{0, \varepsilon, 2\varepsilon, \ldots\})$, indexed by $\varepsilon$, be defined via
$$Y_{t+\varepsilon} = Y_t + \varepsilon V_{t+\varepsilon}(Y_t),$$
where the $V_t(y)$'s are independent random variables whose distribution is described as follows. First consider the power set $\mathcal{A} = 2^{\{1, 2, \ldots, d\}}$ (i.e. the family of subsets of $\{1, 2, \ldots, d\}$). Given $A \in \mathcal{A}$ we define
$$S_A = \{ y \in \mathbb{Z}_+^d : y_i = 0 \text{ for } i \in A \text{ and } y_j > 0 \text{ for } j \notin A \}.$$
In other words, $S_A$ contains states describing situations where exactly the queues whose indices belong to $A$ are empty. The set $S_\emptyset$ is called the interior of the

space, and each of the remaining $S_A$'s describes a boundary. Now define, for $1 \leq i \leq d-1$,
$$e_i = (0,\ldots,0,\underbrace{-1}_{i\text{-th}},\underbrace{1}_{(i+1)\text{-th}},0,\ldots,0)^T, \qquad e_0 = (\underbrace{1}_{1\text{st}},0,\ldots,0)^T, \qquad e_d = (0,\ldots,0,\underbrace{-1}_{d\text{-th}})^T.$$
Then we have that if $y \in S_\emptyset$, then
$$P(V(y) = e_i) = \mu_i, \qquad 0 \leq i \leq d.$$
More generally, if $y \in S_A$ then
$$P(V(y) = 0) = \sum_{i \in A} \mu_i \qquad \text{and} \qquad P(V(y) = e_i) = \mu_i \ \text{ for } i \in A^c,$$
where, by convention, the complement $A^c$ is taken relative to the set $\{0,1,\ldots,d\}$. The log-moment generating function at state $y$ is defined as $\psi(\theta, y) = \log E \exp(\theta^T V(y))$. Lastly, define the $d$-dimensional vector of all ones as $\mathbf{1}$. Note that the process $\varepsilon^{-1} Y_\varepsilon(\varepsilon k)$ is the state of the embedded discrete-time Markov chain associated to the underlying tandem network evaluated at time $k$. Define $T_0 = \inf\{k \geq 1 : Y_\varepsilon(\varepsilon k) = 0\}$ and $T_1 = \inf\{k \geq 1 : Y_\varepsilon(\varepsilon k)^T \mathbf{1} \geq 1\}$. We are interested in analyzing a class of efficient state-dependent importance sampling estimators for $u_\varepsilon = P_0(T_1 < T_0)$ as $\varepsilon \to 0$. The class of samplers that we consider are those proposed by [11]. Our objective is to use Lyapunov inequalities in order to find further insights into the various mollification parameters introduced by Dupuis et al., and also to obtain more information on the behavior of the second moment of the proposed estimator, beyond the asymptotic behavior of its optimal exponential rate of convergence. In particular, the main result of this section is the following.

Theorem 5 There exists a suitable selection of mollification parameters in the subsolution-based change-of-measure of [11] so that the corresponding estimator, $R_\varepsilon(y)$, satisfies
$$\tilde{E} R_\varepsilon(y)^2 / u_\varepsilon^2 = O\left(\varepsilon^{-2(d+2-\beta)}\right) \tag{10}$$
as $\varepsilon \to 0$.
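To make the setting concrete before the analysis, the embedded chain and the crude Monte Carlo baseline can be sketched as follows. This is a minimal illustration of our own, not code from the paper; the function names, the rate vector, and the overflow level `n` are placeholders.

```python
import random

def step(y, mu):
    """One jump of the embedded chain of a d-node tandem network.

    y  : current (unscaled) queue lengths, a list of length d,
    mu : rates with mu[0] the arrival rate and mu[i] the service rate at
         station i, normalized so that sum(mu) == 1.
    At an empty station the service event is a self-loop (increment 0).
    """
    d = len(y)
    u, acc = random.random(), 0.0
    for i, rate in enumerate(mu):
        acc += rate
        if u <= acc:
            break
    y = list(y)
    if i == 0:                      # arrival to station 1
        y[0] += 1
    elif y[i - 1] > 0:              # service at station i (frozen if empty)
        y[i - 1] -= 1
        if i < d:                   # the served customer joins station i+1
            y[i] += 1
    return y

def crude_estimate(n, mu, reps=10000, seed=1):
    """Crude Monte Carlo for u = P_0(total population hits n before 0)."""
    random.seed(seed)
    hits = 0
    for _ in range(reps):
        y = step([0] * (len(mu) - 1), mu)   # first jump out of the empty state
        while 0 < sum(y) < n:
            y = step(y, mu)
        hits += int(sum(y) >= n)
    return hits / reps
```

For $d = 1$ this reduces to a gambler's-ruin computation, which gives a quick correctness check. The point of the section is precisely that this crude estimator needs on the order of $1/u_\varepsilon$ replications for fixed relative accuracy, which motivates the importance sampler analyzed below.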

The rest of the section is dedicated to the analysis of Theorem 5. However, before continuing with such analysis, let us point out the implications behind (10) in terms of the computational cost associated to the importance sampling estimator underlying Theorem 5. As we indicated in Section 2, Chebyshev's inequality implies that $O(\varepsilon^{-2(d+2-\beta)})$ replications are required to obtain an estimator with a prescribed relative accuracy. On the other hand, solving the associated linear system of equations to compute $P_y(T_1 < T_0)$, as a function of $y \in Z_+^d$, using say Gaussian elimination, will typically take at least $c_0 \varepsilon^{-2d}$ operations for some $c_0 > 0$. As a consequence, if $\beta \geq 2$, Theorem 5 guarantees that the corresponding importance sampling scheme has no worse complexity rate, as a function of $\varepsilon$, than solving the associated linear system of equations by Gaussian elimination. Furthermore, as we shall see, our analysis depends on a lower bound of the form $u_\varepsilon \geq c \exp(-\gamma/\varepsilon)/\varepsilon^{\beta-2}$, which can be improved in order to replace (10) by $O(\varepsilon^{-2(d+1-\beta)})$, a result that, in turn, implies that the corresponding importance sampling scheme indicated in Theorem 5 can be guaranteed to have, in general, at most the same complexity rate as Gaussian elimination. We are now ready to move into the development behind Theorem 5. We start our discussion by stating the following simple lower bound on $u_\varepsilon$.

Proposition 1 There exists a constant $c \in (0,\infty)$ such that $u_\varepsilon \geq c \exp(-\gamma/\varepsilon)/\varepsilon^{\beta-2}$.

Proof. We write $Y_\varepsilon(\varepsilon k)/\varepsilon := X_k := (X_1(k),\ldots,X_d(k))^T$ to describe the state at time $k$ of the embedded discrete-time Markov chain corresponding to the tandem network. It is known that $(X_k : k \geq 0)$ possesses a stationary distribution of product form, given by
$$P(X_\infty = (x_1,\ldots,x_d)^T) = \pi(x_1,\ldots,x_d) = \prod_{i=1}^d (1-\rho_i)\rho_i^{x_i}.$$
Let us define $\tau = \inf\{k \geq 1 : X_k^T \mathbf{1} \geq 1/\varepsilon\}$ and $\tau_0 = \inf\{k \geq 1 : X_k =$

$0\}$. Regenerative theory tells us that
$$P(X_\infty^T \mathbf{1} \geq 1/\varepsilon) = \frac{E_0 \sum_{k=0}^{\tau_0-1} I(X_k^T \mathbf{1} \geq 1/\varepsilon)}{E_0 \tau_0} = \frac{E_0\left[\sum_{k=\tau}^{\tau_0-1} I(X_k^T \mathbf{1} \geq 1/\varepsilon);\ \tau < \tau_0\right]}{E_0 \tau_0};$$
equivalently,
$$P(X_\infty^T \mathbf{1} \geq 1/\varepsilon) = \frac{E_0\left[\sum_{k=\tau}^{\tau_0-1} I(X_k^T \mathbf{1} \geq 1/\varepsilon)\ \Big|\ \tau < \tau_0\right]}{E_0 \tau_0}\, u_\varepsilon. \tag{11}$$
It follows easily, using a simple Foster-Lyapunov criterion, that there exists a constant $c \in (0,\infty)$ such that $E_x \tau_0 \leq c\, x^T \mathbf{1}$ for $x > 0$ component by component. Therefore,
$$E_0\left[\sum_{k=\tau}^{\tau_0-1} I(X_k^T \mathbf{1} \geq 1/\varepsilon)\ \Big|\ \tau < \tau_0\right] \leq E_0\left[E_{X_\tau} \tau_0 \ \big|\ \tau < \tau_0\right] \leq E_0\left[c\, X_\tau^T \mathbf{1} \ \big|\ \tau < \tau_0\right] \leq c'/\varepsilon$$
for all $\varepsilon \leq \varepsilon_0$, for some $\varepsilon_0 \in (0,\infty)$. We then obtain from (11) that
$$u_\varepsilon \geq \frac{P(X_\infty^T \mathbf{1} \geq 1/\varepsilon)\, E_0 \tau_0}{c'/\varepsilon}.$$
On the other hand, it follows by simple properties involving negative binomial distributions that there exists a constant $c_0 \in (0,\infty)$ such that
$$P(X_\infty^T \mathbf{1} \geq 1/\varepsilon) \geq c_0 \exp(-\gamma/\varepsilon)/\varepsilon^{\beta-1}$$
as $\varepsilon \to 0$.
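The negative-binomial growth invoked in the last step can be checked directly from the product form. The sketch below is our own numerical illustration (the rate vector and truncation level are arbitrary): it convolves independent geometric marginals and exhibits the $n^{\beta-1}\rho^n$ tail growth when $\beta$ stations share the maximal $\rho_i$.

```python
def tail_of_total(rhos, n, trunc=400):
    """P(X_1 + ... + X_d >= n) for independent X_i with
    P(X_i = k) = (1 - rho_i) * rho_i**k, computed by convolution.

    Mass beyond `trunc` is dropped; for rho_i bounded away from 1 this
    error is negligible at the levels n considered here.
    """
    dist = [1.0]                       # distribution of the empty sum
    for rho in rhos:
        geo = [(1 - rho) * rho ** k for k in range(trunc)]
        new = [0.0] * trunc
        for a, pa in enumerate(dist):
            if pa == 0.0:
                continue
            for b in range(trunc - a):
                new[a + b] += pa * geo[b]
        dist = new
    return sum(dist[n:])
```

With two bottlenecks at $\rho = 1/2$, the ratio of the tail to $n\,\rho^n$ settles near a constant, which is the behavior the proof uses.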

Remark 5 The lower bound given in the previous proposition is not sharp. Note, for instance, that in the case $d = 1 = \beta$ the premultiplying factor is of constant order, and not of order $\varepsilon$ as suggested by the proposition. A more precise analysis requires a somewhat more delicate argument and will be given elsewhere. Our goal here is just to illustrate the power of Lyapunov methods, and therefore we prefer to keep the discussion as streamlined as possible. We now proceed to the construction of associated Lyapunov inequalities. In order to describe the issues that arise when constructing appropriate Lyapunov functions, let us start our discussion with the case of the two-node tandem network studied by Parekh and Walrand [20].

6.1 A Two Node Tandem Network

Let us first consider the case $d = 2$ and assume that $\mu_0 < \mu_2 \leq \mu_1$. As in Section 5, it is natural to consider mixtures of exponential changes-of-measure. In particular,
$$\frac{\tilde{P}(V(y) \in dz)}{P(V(y) \in dz)} = L(y)^{-1} = \sum_{j=0}^{k} r_j(y) \exp(\theta_j^T z - \psi(\theta_j, y)), \tag{12}$$
for an appropriate value $k > 0$, some $\theta_j$'s and given functions $r_j$'s. Once again, motivated by the analysis in Section 5, we propose setting
$$r_j(y) = \frac{w_j(y)}{\sum_{i=0}^{k} w_i(y)} \qquad \text{with} \qquad w_j(y) = \exp(\theta_j^T y/\varepsilon + a_j).$$
We must find an appropriate Lyapunov function which, once again following the arguments in Section 5, we propose to be
$$v(y) = \left(\sum_{j=0}^{k} w_j(y)\right)^2.$$

Verifying the Lyapunov inequality would entail showing
$$E\left[\frac{v(y + \varepsilon V(y))}{v(y)} \left(\sum_{j=0}^{k} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1}\right] \leq 1, \tag{13}$$
where the left-hand side equals
$$E\left[\frac{\left(\sum_{j=0}^{k} w_j(y) \exp(\theta_j^T V(y))\right)^2}{\left(\sum_{j=0}^{k} w_j(y)\right) \sum_{j=0}^{k} w_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}\right].$$
An interesting twist arises in the current setting which did not occur in Section 5. The issue is that we may not be able to guarantee that $\psi(\theta_j, y) \leq 0$ for all $y, j$, and therefore (13) does not follow immediately, as it did in Section 5. However, the observation noted by [11] is that one can select the constants $a_j$'s so that
$$E\left[\frac{\left(\sum_{j=0}^{k} w_j(y) \exp(\theta_j^T V(y))\right)^2}{\left(\sum_{j=0}^{k} w_j(y)\right) \sum_{j=0}^{k} w_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}\right] = 1 + \epsilon^*,$$
for some error $\epsilon^*$ that can be appropriately controlled. In particular, taking advantage of the associated subsolution to the Isaacs equation introduced by [11], one takes $k = 2$ and
$$\theta_0 = \gamma(1,1)^T, \qquad \theta_1 = \gamma(1,0)^T, \qquad \theta_2 = (0,0)^T, \qquad a_j = (-\gamma + j\delta)/\varepsilon, \quad 0 \leq j \leq 2.$$
The parameter $\delta > 0$ will be selected as a function of $\varepsilon$ in order to appropriately control the error $\epsilon^*$. As we shall see, the Lyapunov inequality suggests a suitable selection that minimizes the growth of the upper bound on the second moment of the estimator, namely, $\delta = \varepsilon \log(1/\varepsilon)$. The overall estimator then takes the form
$$R_\varepsilon(y) = \prod_{j=0}^{T_1 - 1} L(Y_j) \cdot I(T_1 < T_0). \tag{14}$$
In order to discuss the properties implied by the previous selection of parameters, recall
$$S_\emptyset = \{(y_1,y_2) \in Z_+^2 : y_1, y_2 > 0\}, \quad S_1 = \{(y_1,y_2) \in Z_+^2 : y_1 = 0,\ y_2 > 0\}, \quad S_2 = \{(y_1,y_2) \in Z_+^2 : y_1 > 0,\ y_2 = 0\}.$$

It is easy to verify that:

1. If $y \in S_\emptyset$ then $\psi(\theta_j, y) \leq 0$ for $j = 0, 1, 2$.
2. If $y \in S_1$ then $\psi(\theta_j, y) = 0$ for $j = 0, 2$, but $\psi(\theta_1, y)$ can be positive.
3. If $y \in S_2$ then $\psi(\theta_j, y) \leq 0$ for $j = 1, 2$, but $\psi(\theta_0, y)$ can be positive.

Given the previous observations, it follows immediately that (13) holds in the interior, namely in the set $S_\emptyset$. The cases $y \in S_1$ and $y \in S_2$ are similar, so let us just explain what occurs if $y \in S_1$. Define
$$G(y) = \frac{v(y + \varepsilon V(y))}{v(y)} \left(\sum_{j=0}^{2} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1} = \frac{\left(\sum_{j=0}^{2} r_j(y) \exp(\theta_j^T V(y))\right)^2}{\sum_{j=0}^{2} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}.$$
We note that on the set $S_1$, (13) can be violated simply because of $w_1(y)$; however, we also note that on $S_1$,
$$\frac{w_1(y)}{w_0(y) + w_2(y)} \leq \exp(-\delta/\varepsilon).$$
Therefore, if $y \in S_1$,
$$G(y) \leq \frac{\left(\sum_{j=0}^{2} r_j(y) \exp(\theta_j^T V(y))\right)^2}{\sum_{j=0,2} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))} \leq \frac{\left(\sum_{j=0,2} r_j(y) e^{\theta_j^T V(y)} + r_1(y)\, e^{\theta_1^T V(y)}\right)^2}{\sum_{j=0,2} r_j(y) e^{\theta_j^T V(y)}} \leq \sum_{j=0,2} r_j(y)\, e^{\theta_j^T V(y)} + 2m\, e^{-\delta/\varepsilon},$$
for some constant $m \in (0,\infty)$ which can be suitably chosen. Note that for $y \in S_1$,
$$E\left[\sum_{j=0,2} r_j(y)\, e^{\theta_j^T V(y)}\right] \leq 1.$$
A similar argument follows for the case $y \in S_2$. A key observation in such case is that $w_0(y)/(w_1(y) + w_2(y)) \leq \exp(-\delta/\varepsilon)$; the argument then follows just as before.

In summary, defining $\epsilon^* = \exp(-\delta/\varepsilon)$, we obtain that
$$E\, G(y) = E\left[\frac{v(y + \varepsilon V(y))}{v(y)} \left(\sum_{j=0}^{2} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1}\right] \leq 1 + m\epsilon^*; \tag{15}$$
evidently, we also have that $v(y) \geq 1$ whenever $y^T \mathbf{1} \geq 1$. Choosing $\delta$ so that $m\epsilon^* < 1/2$, Lemma 1 then implies
$$v(y) \geq E\left[\exp(-m\epsilon^* T_1)\, R_\varepsilon(y)\right]. \tag{16}$$
We still need to transform (16) into an inequality for the second moment of the estimator, i.e. $E R_\varepsilon(y) = \tilde{E} R_\varepsilon(y)^2$. We will come back to this issue later, after developing the corresponding Lyapunov inequality for a $d$-node tandem network. The idea follows the same spirit as in the second part of Lemma 1, by arguing that $T_1 \leq \lambda/\varepsilon$ with very high probability and choosing $\epsilon^* = \varepsilon$, i.e. $\delta = \varepsilon \log(1/\varepsilon)$.

6.2 Relaxed Lyapunov Inequality for a Markovian Tandem Network

Once again, we depart from the subsolution to the Isaacs equation proposed by [11]. In particular, we define the $i$-th component of the vector $\theta_j$, for $0 \leq j \leq k = d$ and $1 \leq i \leq d$, via
$$\theta_j^{(i)} = \begin{cases} \gamma, & 1 \leq i \leq d-j, \\ 0, & \text{otherwise}, \end{cases}$$
and put $w_j(y) = \exp((\theta_j^T y - \gamma + j\delta)/\varepsilon)$. The rest of the description of the sampler proceeds just as in the previous section; namely, we set $r_j(y) = w_j(y)/\sum_{i=0}^{d} w_i(y)$ and $\tilde{P}(V(y) \in dv)$ as indicated in (12). We wish to test a Lyapunov inequality such as (15); therefore we define the Lyapunov function
$$v(y) = \left(\sum_{j=0}^{d} w_j(y)\right)^2. \tag{17}$$
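For concreteness, the resulting state-dependent change of measure can be implemented directly. The sketch below is our own illustration, not the paper's code: `y` is the scaled state, `eps` the scaling parameter, the default mollification is $\delta = \varepsilon\log(1/\varepsilon)$, and the mixture weights are normalized in log-space for numerical stability.

```python
import math

def tilted_step_probs(y, mu, eps, delta=None):
    """One-step transition probabilities of the subsolution-based sampler.

    y   : scaled state (length d, entries >= 0),
    mu  : [mu_0, ..., mu_d] with sum(mu) == 1 and mu_0 < min(mu[1:]),
    eps : scaling parameter; delta defaults to eps*log(1/eps).
    Tilting vectors theta_j carry gamma in coordinates 1..d-j, 0 elsewhere.
    Returns {increment (tuple): probability under the tilted kernel}.
    """
    d = len(y)
    gamma = math.log(min(mu[1:]) / mu[0])
    if delta is None:
        delta = eps * math.log(1.0 / eps)
    thetas = [[gamma] * (d - j) + [0.0] * j for j in range(d + 1)]
    # mixture weights r_j from w_j(y) = exp((theta_j.y - gamma + j*delta)/eps),
    # normalized via log-sum-exp
    logw = [(sum(t * v for t, v in zip(th, y)) - gamma + j * delta) / eps
            for j, th in enumerate(thetas)]
    m = max(logw)
    r = [math.exp(lw - m) for lw in logw]
    s = sum(r)
    r = [x / s for x in r]
    # original increments: arrival e_0, service e_i (frozen if queue i empty)
    incs = {}
    e0 = (1,) + (0,) * (d - 1)
    incs[e0] = incs.get(e0, 0.0) + mu[0]
    for i in range(1, d + 1):
        z = [0] * d
        if y[i - 1] > 0:
            z[i - 1] -= 1
            if i < d:
                z[i] += 1
        z = tuple(z)
        incs[z] = incs.get(z, 0.0) + mu[i]
    # psi(theta_j, y) = log E exp(theta_j . V(y))
    psi = [math.log(sum(p * math.exp(sum(t * v for t, v in zip(th, z)))
                        for z, p in incs.items())) for th in thetas]
    return {z: p * sum(r[j] * math.exp(sum(t * v for t, v in zip(thetas[j], z))
                                       - psi[j]) for j in range(d + 1))
            for z, p in incs.items()}
```

By construction the tilted probabilities sum to one, since each tilt is normalized by its own $\psi(\theta_j, y)$ and the mixture weights $r_j(y)$ sum to one; near the origin the mixture favors the full twist and boosts arrivals.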

Just as occurred in the two-dimensional case, introducing $\delta > 0$ allows us to handle states in which at least one queue might be empty, i.e. potential cases in which $\psi(\theta_j, y) > 0$ might occur. In order to see how one handles these states, define, given $y \in Z_+^d$,
$$B(y) = \{0 \leq j \leq d-1 : y_{d-j} = 0\}.$$
This collection of indices represents the tilting vectors $\theta_j$ that should be avoided when the scaled queue length is at $y$, i.e. those with $\psi(\theta_j, y) > 0$. In particular, we have the following lemma; its results are very similar to those found in Lemma 4.4 of [11], but we include the details of the proof here to ease confusion.

Lemma 2 If $j \notin B(y)$ then $\psi(\theta_j, y) \leq 0$. If $j \in B(y)$ then $\psi(\theta_j, y) > 0$ and
$$r_j(y) \leq \frac{w_j(y)}{\sum_{k \notin B(y)} w_k(y)} \leq \exp(-\delta/\varepsilon). \tag{18}$$

Proof. First note that we only need to consider $\theta_j$ for $0 \leq j \leq d-1$, since by definition $d \notin B(y)$ and it is trivially seen that $\psi(\theta_d, y) = 0$ for all $y$. Thus consider $0 \leq j \leq d-1$ and $1 \leq i \leq d$, and observe that
$$\theta_j^T e_i = \begin{cases} 0, & d-j \neq i, \\ -\gamma, & d-j = i. \end{cases} \tag{19}$$
Next, for $A \in \mathcal{A}$ and $y \in S_A$, we have
$$\psi(\theta_j, y) = \log\left(\mu_0\, e^{\theta_j^T e_0} + \sum_{i \notin A} \mu_i\, e^{\theta_j^T e_i} + \sum_{i \in A} \mu_i\right).$$
If $j \notin B(y)$, then by definition $d-j \notin A$, and therefore the previous display and (19) give
$$\psi(\theta_j, y) = \log\left(\mu + \frac{\mu_0\, \mu_{d-j}}{\mu} + \sum_{1 \leq i \leq d,\ i \neq d-j} \mu_i\right) \leq 0.$$
The details for the above inequality can be found in [11].

If $j \in B(y)$, then we see that
$$\psi(\theta_j, y) = \log\left(\mu + \sum_{i=1}^{d} \mu_i\right) > 0.$$
Thus it remains to prove (18). For $j \in B(y)$ define
$$j^* = \min\{k > j : y_{d-k} \neq 0\} \wedge d,$$
and note that by definition $j^* \notin B(y)$. In addition, we have
$$\log w_j(y) = \frac{\theta_j^T y - \gamma + j\delta}{\varepsilon} = \frac{\gamma \sum_{k=1}^{d-j} y_k - \gamma + j\delta}{\varepsilon} = \frac{\gamma \sum_{k=1}^{d-j^*} y_k - \gamma + j\delta}{\varepsilon} = \log w_{j^*}(y) - \frac{(j^*-j)\delta}{\varepsilon}.$$
Using the previous display, (18) follows immediately.

We are now ready to prove the following Lyapunov inequality. Indeed, set
$$G(y) = \frac{v(y + \varepsilon V(y))}{v(y)} \left(\sum_{j=0}^{d} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1} = \frac{\left(\sum_{j=0}^{d} r_j(y) \exp(\theta_j^T V(y))\right)^2}{\sum_{j=0}^{d} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}.$$
It follows from Lemma 2, and simple algebraic manipulations entirely analogous to those performed in the case $d = 2$, that
$$G(y) \leq \frac{\left(\sum_{j=0}^{d} r_j(y)\, e^{\theta_j^T V(y)}\right)^2}{\sum_{j \notin B(y)} r_j(y)\, e^{\theta_j^T V(y)}} \leq \frac{\left(\sum_{j \notin B(y)} r_j(y)\, e^{\theta_j^T V(y)} + m_0\, e^{-\delta/\varepsilon}\right)^2}{\sum_{j \notin B(y)} r_j(y)\, e^{\theta_j^T V(y)}} \leq \sum_{j \notin B(y)} r_j(y)\, e^{\theta_j^T V(y)} + m_1\, e^{-\delta/\varepsilon}. \tag{20}$$
Combining the previous analysis with Lemma 2 and Lemma 1, we obtain the following result.

Proposition 2 If $\delta := \delta_\varepsilon = \varepsilon \log(1/\varepsilon)$, then there exists $m_1 \in (0,\infty)$ such that
$$E\left[\frac{v(y + \varepsilon V(y))}{v(y)} \left(\sum_{j=0}^{d} r_j(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1}\right] \leq 1 + m_1 \varepsilon,$$
for $y \geq 0$ with $y^T \mathbf{1} \leq 1$. Since $v(y) \geq 1$ if $y^T \mathbf{1} \geq 1$, we obtain
$$v(y) \geq E\left[\exp(-m_1 \varepsilon T_1)\, R_\varepsilon(y)\right], \qquad \text{where} \qquad R_\varepsilon(y) = \prod_{j=0}^{T_1-1} L(Y_j) \cdot I(T_1 < T_0).$$

The final ingredient in the proof of Theorem 5 consists in transforming the bound for $E[\exp(-m_1 \varepsilon T_1) R_\varepsilon(0)]$ given in Proposition 2 into a bound for $E R_\varepsilon(0)$. First fix $\lambda \in (0,\infty)$ and note that
$$E[R_\varepsilon(0)] = E[R_\varepsilon(0);\ T_1 > \lambda/\varepsilon,\ T_1 < T_0] + E[R_\varepsilon(0);\ T_1 \leq \lambda/\varepsilon,\ T_1 < T_0] \leq E[R_\varepsilon(0);\ T_1 > \lambda/\varepsilon,\ T_1 < T_0] + e^{m_1 \lambda}\, E[e^{-m_1 \varepsilon T_1} R_\varepsilon(0);\ T_1 \leq \lambda/\varepsilon,\ T_1 < T_0].$$
We will argue that for every $\kappa \in (0,\infty)$ there exists $\lambda \in (0,\infty)$ such that
$$E[R_\varepsilon(0);\ T_1 > \lambda/\varepsilon,\ T_1 < T_0] = o(\exp(-\kappa/\varepsilon)). \tag{21}$$
Then, using the non-negativity of $R_\varepsilon(0)$, the result of Proposition 2, and the previous display with $\kappa > 2\gamma$, we see that
$$E[R_\varepsilon(0)] \leq E[R_\varepsilon(0);\ T_1 > \lambda/\varepsilon,\ T_1 < T_0] + e^{m_1 \lambda}\, E[e^{-m_1 \varepsilon T_1} R_\varepsilon(0)] \leq o(\exp(-\kappa/\varepsilon)) + e^{m_1 \lambda}\, v(0) = O(\varepsilon^{-2d} \exp(-2\gamma/\varepsilon)).$$
Therefore, it remains to prove (21). First, an application of the Cauchy-Schwarz inequality gives
$$E[R_\varepsilon(0);\ T_1 > \lambda/\varepsilon,\ T_1 < T_0] \leq E[R_\varepsilon(0)^2]^{1/2}\, P(T_1 > \lambda/\varepsilon,\ T_1 < T_0)^{1/2}.$$
From Lemma B.1 of [10] it follows that there exists $c > 0$ such that
$$E[e^{c T_0}] < \infty. \tag{22}$$

Therefore, we have that given $\kappa \in (0,\infty)$ there exists $\lambda > 0$ such that $P(T_0 > \lambda/\varepsilon) = o(\exp(-\kappa/\varepsilon))$. Thus display (21) follows if we can show that $E[R_\varepsilon(0)^2]$ stays bounded as $\varepsilon \to 0$. Recalling the definition of $R_\varepsilon(0)$ from equation (14), we have
$$E[R_\varepsilon(0)^2] = E\left[I(T_1 < T_0)\left(\prod_{j=0}^{T_1-1} L(Y_j)\right)^2\right].$$
Recalling the Lyapunov function $v(y)$ from display (17), and using arguments similar to those in display (20), we observe the following inequality:
$$\frac{v(y + \varepsilon V(y))}{v(y)}\left(\sum_{j=0}^{d} r_j(y) \exp[\theta_j^T V(y) - \psi(\theta_j, y)]\right)^{-2} = \frac{\left(\sum_{j=0}^{d} r_j(y)\, e^{\theta_j^T V(y)}\right)^2}{\left(\sum_{j=0}^{d} r_j(y) \exp[\theta_j^T V(y) - \psi(\theta_j, y)]\right)^2} \leq 1 + K e^{-\delta/\varepsilon},$$
where $K$ is a constant depending only on the system parameters. Importantly, note that this inequality holds pathwise, i.e., it does not involve expected values. Therefore, if we take $\delta = \varepsilon \log(1/\varepsilon)$, we have
$$E[R_\varepsilon(0)^2] = E\left[I(T_1 < T_0)\, \frac{v(Y_0)}{v(Y_{T_1})} \prod_{k=0}^{T_1-1} \frac{v(Y_{k+1})}{v(Y_k)}\left(\sum_{j=0}^{d} r_j(Y_k) \exp[\theta_j^T V(Y_k) - \psi(\theta_j, Y_k)]\right)^{-2}\right] \leq 2\, E[I(T_1 < T_0) \exp(T_1 \log(1 + K\varepsilon))],$$
where the last inequality follows from the definition of the function $v$ and the stopping time $T_1$. The result then follows from equation (22). This completes the proof of Theorem 5.

6.3 Analysis with General Mollification Parameters

A noted difference between the importance sampling algorithm discussed in this work and the one considered in [11] is that the latter algorithm used a general

sequence of mollification parameters, while the present algorithm simply uses $\bar\varepsilon = \varepsilon$. In this section we would like to carry out our analysis under the condition of a varying mollification parameter $\bar\varepsilon$. In particular, we consider a variant of the change of measure described by (12),
$$\frac{\tilde{P}(V(y) \in dz)}{P(V(y) \in dz)} = L(y)^{-1} = \sum_{j=0}^{k} r_j^{\bar\varepsilon}(y) \exp(\theta_j^T z - \psi(\theta_j, y)), \tag{23}$$
where
$$r_j^{\bar\varepsilon}(y) = \frac{w_j^{\bar\varepsilon}(y)}{\sum_{i=0}^{k} w_i^{\bar\varepsilon}(y)} \tag{24}$$
and
$$w_j^{\bar\varepsilon}(y) = \exp\left[(\theta_j^T y - \gamma + j\delta)/\bar\varepsilon\right]. \tag{25}$$
We can assume that $\bar\varepsilon$ and $\delta$ vary with $\varepsilon$, but we will suppress this in the notation. The Lyapunov function that we will use in this situation is
$$v^{\bar\varepsilon}(y) = \left(\sum_{j=0}^{d} w_j^{\bar\varepsilon}(y)\right)^{2\bar\varepsilon/\varepsilon}.$$
Therefore, in order to apply Lemma 1, we need to establish the following inequality:
$$G(y) = E\left[\frac{v^{\bar\varepsilon}(y + \varepsilon V(y))}{v^{\bar\varepsilon}(y)} \left(\sum_{j=0}^{d} r_j^{\bar\varepsilon}(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))\right)^{-1}\right] = E\left[\frac{\left(\sum_{j=0}^{d} r_j^{\bar\varepsilon}(y) \exp\left(\frac{\varepsilon}{\bar\varepsilon}\theta_j^T V(y)\right)\right)^{2\bar\varepsilon/\varepsilon}}{\sum_{j=0}^{d} r_j^{\bar\varepsilon}(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}\right] \leq 1.$$
Under the assumption that $\bar\varepsilon \geq \varepsilon$ for all $\varepsilon > 0$, we can apply Jensen's inequality to the expression for $G(y)$ and get
$$G(y) \leq E\left[\frac{\left(\sum_{j=0}^{d} r_j^{\bar\varepsilon}(y) \exp(\theta_j^T V(y))\right)^2}{\sum_{j=0}^{d} r_j^{\bar\varepsilon}(y) \exp(\theta_j^T V(y) - \psi(\theta_j, y))}\right].$$

However, the term in the previous display can be analyzed using the same techniques as in the proof of Lemma 2, and we arrive at
$$G(y) \leq 1 + e^{-\delta/\bar\varepsilon}.$$
Therefore, if we can ensure that $e^{-\delta/\bar\varepsilon}$ goes to zero sufficiently fast with $\varepsilon$ (taking, for instance, $\delta = \bar\varepsilon \log(1/\varepsilon)$), we can say that the performance of the algorithm is determined by
$$v^{\bar\varepsilon}(0) \approx e^{-2\gamma/\varepsilon}\, (1/\varepsilon)^{2\bar\varepsilon d/\varepsilon}.$$
Based on this analysis, it appears that little can be gained from introducing a mollification parameter $\bar\varepsilon$ that varies with $\varepsilon$.

Remark 6 There is a slight difference between the algorithm described in this section and the algorithm described in [11]. In particular, both algorithms use a mixture of exponential twists according to the vectors $\theta_j$, but in Dupuis et al. the weights are defined as
$$w_j^{\bar\varepsilon}(y) = \exp\left[(2\theta_j^T y - 2\gamma + j\delta)/\bar\varepsilon\right].$$
However, the two algorithms can in fact be mapped to each other in a simple fashion: for any $\delta$ and $\bar\varepsilon$ one uses in the algorithm of [11], one obtains an identical algorithm from equations (23) to (25) with $\delta' = \delta/2$ and $\bar\varepsilon' = \bar\varepsilon/2$. In [11] the parameter selection is made according to the rules:

1. $\delta \to 0$ as $\varepsilon \to 0$,
2. $\bar\varepsilon/\varepsilon \to \infty$ as $\varepsilon \to 0$,
3. $\delta/\bar\varepsilon \to \infty$ as $\varepsilon \to 0$.

With these choices it is possible to show that the importance sampling algorithm is asymptotically efficient, i.e. the exponential decay rate of the second moment matches the exponential decay rate of the probability squared. Moreover, in the paper [11] explicit bounds are obtained on the second moment, and from these it is possible to obtain further information on the decay rate of the second moment of the importance sampling estimator. In the following we pursue this goal.
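The trade-off governing the choice of $\delta$ in the explicit bound discussed next can be made concrete: the mollification term contributes an excess of order $\delta d/\varepsilon$ to the exponent, while the error-control term contributes one of order $2\gamma C/\delta$. The sketch below is our own illustration, with $d$, $\gamma$, $C$ treated as placeholder constants.

```python
import math

def excess_exponent(delta, eps, d, gamma, C):
    """Excess (beyond -2*gamma/eps) in the exponent of a bound of the form
    E[R(0)] <= M * exp(-2*gamma/eps + delta*d/eps + 2*gamma*C/delta)."""
    return delta * d / eps + 2 * gamma * C / delta

def optimal_delta(eps, d, gamma, C):
    """Closed-form minimizer sqrt(2*gamma*C*eps/d), i.e. of order eps**0.5,
    at which the excess exponent is 2*sqrt(2*gamma*C*d/eps)."""
    return math.sqrt(2 * gamma * C * eps / d)
```

The minimized excess is of order $\varepsilon^{-1/2}$, whereas the choice $\delta = \varepsilon \log(1/\varepsilon)$ in Theorem 5 yields an excess of order $\log(1/\varepsilon)$ only, which is why the bound obtained this way is much looser.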

The particular bound achieved is equivalent to the following:
$$E[R_\varepsilon(0)] \leq \exp\left[-\frac{1+\alpha}{1+2\alpha} \cdot \frac{\bar{W}(0)}{\varepsilon}\right] E \exp\left(\frac{\bar\beta(1+\alpha)\delta}{2\alpha}\, T_0\right),$$
where
$$\bar{W}(0) = -\bar\varepsilon \log \sum_{k=0}^{d} \exp\left(\frac{k\delta - 2\gamma}{\bar\varepsilon}\right),$$
$\bar\beta \leq C\bar\varepsilon$, and $\alpha \leq K\bar\beta/\delta$ for a positive constant $K$. In addition, we have the following bound for positive $\alpha$:
$$\frac{1+\alpha}{1+2\alpha} \geq 1 - \frac{C\bar\varepsilon}{\delta}.$$
Putting this together with the assumptions 1-3 above, we have, for a positive constant $M$,
$$E[R_\varepsilon(0)] \leq M \exp\left[-\left(1 - \frac{C\bar\varepsilon}{\delta}\right)\frac{2\gamma - \delta d}{\varepsilon}\right] \leq M \exp\left(-\frac{2\gamma}{\varepsilon} + \frac{\delta d}{\varepsilon} + \frac{2\gamma C}{\delta}\right).$$
Minimizing the above exponent over $\delta$, we see that the optimal choice $\delta = C_0 \varepsilon^{1/2}$ gives the bound
$$E[R_\varepsilon(0)] \leq M \exp\left(-\frac{2\gamma}{\varepsilon} + \frac{C_0}{\varepsilon^{1/2}}\right).$$
This is clearly a much looser bound than that obtained in the result of Theorem 5.

References

[1] S. Asmussen. Applied Probability and Queues. Springer-Verlag, New York, NY, USA.

[2] S. Asmussen, K. Binswanger, and B. Hojgaard. Rare events simulation for heavy-tailed distributions. Bernoulli, 6.


More information

EFFICIENT TAIL ESTIMATION FOR SUMS OF CORRELATED LOGNORMALS. Leonardo Rojas-Nandayapa Department of Mathematical Sciences

EFFICIENT TAIL ESTIMATION FOR SUMS OF CORRELATED LOGNORMALS. Leonardo Rojas-Nandayapa Department of Mathematical Sciences Proceedings of the 2008 Winter Simulation Conference S. J. Mason, R. R. Hill, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. EFFICIENT TAIL ESTIMATION FOR SUMS OF CORRELATED LOGNORMALS Jose Blanchet

More information

Stochastic process for macro

Stochastic process for macro Stochastic process for macro Tianxiao Zheng SAIF 1. Stochastic process The state of a system {X t } evolves probabilistically in time. The joint probability distribution is given by Pr(X t1, t 1 ; X t2,

More information

6.842 Randomness and Computation March 3, Lecture 8

6.842 Randomness and Computation March 3, Lecture 8 6.84 Randomness and Computation March 3, 04 Lecture 8 Lecturer: Ronitt Rubinfeld Scribe: Daniel Grier Useful Linear Algebra Let v = (v, v,..., v n ) be a non-zero n-dimensional row vector and P an n n

More information

Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems

Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems Jingnan Fan Andrzej Ruszczyński November 5, 2014; revised April 15, 2015 Abstract For controlled discrete-time

More information

1 Continuous-time chains, finite state space

1 Continuous-time chains, finite state space Université Paris Diderot 208 Markov chains Exercises 3 Continuous-time chains, finite state space Exercise Consider a continuous-time taking values in {, 2, 3}, with generator 2 2. 2 2 0. Draw the diagramm

More information

The Codimension of the Zeros of a Stable Process in Random Scenery

The Codimension of the Zeros of a Stable Process in Random Scenery The Codimension of the Zeros of a Stable Process in Random Scenery Davar Khoshnevisan The University of Utah, Department of Mathematics Salt Lake City, UT 84105 0090, U.S.A. davar@math.utah.edu http://www.math.utah.edu/~davar

More information

UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES

UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES Applied Probability Trust 7 May 22 UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES HAMED AMINI, AND MARC LELARGE, ENS-INRIA Abstract Upper deviation results are obtained for the split time of a

More information

SPATIAL STRUCTURE IN LOW DIMENSIONS FOR DIFFUSION LIMITED TWO-PARTICLE REACTIONS

SPATIAL STRUCTURE IN LOW DIMENSIONS FOR DIFFUSION LIMITED TWO-PARTICLE REACTIONS The Annals of Applied Probability 2001, Vol. 11, No. 1, 121 181 SPATIAL STRUCTURE IN LOW DIMENSIONS FOR DIFFUSION LIMITED TWO-PARTICLE REACTIONS By Maury Bramson 1 and Joel L. Lebowitz 2 University of

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

E cient Simulation and Conditional Functional Limit Theorems for Ruinous Heavy-tailed Random Walks

E cient Simulation and Conditional Functional Limit Theorems for Ruinous Heavy-tailed Random Walks E cient Simulation and Conditional Functional Limit Theorems for Ruinous Heavy-tailed Random Walks Jose Blanchet and Jingchen Liu y June 14, 21 Abstract The contribution of this paper is to introduce change

More information

Practical conditions on Markov chains for weak convergence of tail empirical processes

Practical conditions on Markov chains for weak convergence of tail empirical processes Practical conditions on Markov chains for weak convergence of tail empirical processes Olivier Wintenberger University of Copenhagen and Paris VI Joint work with Rafa l Kulik and Philippe Soulier Toronto,

More information

If g is also continuous and strictly increasing on J, we may apply the strictly increasing inverse function g 1 to this inequality to get

If g is also continuous and strictly increasing on J, we may apply the strictly increasing inverse function g 1 to this inequality to get 18:2 1/24/2 TOPIC. Inequalities; measures of spread. This lecture explores the implications of Jensen s inequality for g-means in general, and for harmonic, geometric, arithmetic, and related means in

More information

Relative growth of the partial sums of certain random Fibonacci-like sequences

Relative growth of the partial sums of certain random Fibonacci-like sequences Relative growth of the partial sums of certain random Fibonacci-like sequences Alexander Roitershtein Zirou Zhou January 9, 207; Revised August 8, 207 Abstract We consider certain Fibonacci-like sequences

More information

THE HEAVY-TRAFFIC BOTTLENECK PHENOMENON IN OPEN QUEUEING NETWORKS. S. Suresh and W. Whitt AT&T Bell Laboratories Murray Hill, New Jersey 07974

THE HEAVY-TRAFFIC BOTTLENECK PHENOMENON IN OPEN QUEUEING NETWORKS. S. Suresh and W. Whitt AT&T Bell Laboratories Murray Hill, New Jersey 07974 THE HEAVY-TRAFFIC BOTTLENECK PHENOMENON IN OPEN QUEUEING NETWORKS by S. Suresh and W. Whitt AT&T Bell Laboratories Murray Hill, New Jersey 07974 ABSTRACT This note describes a simulation experiment involving

More information

Stochastic Dynamic Programming: The One Sector Growth Model

Stochastic Dynamic Programming: The One Sector Growth Model Stochastic Dynamic Programming: The One Sector Growth Model Esteban Rossi-Hansberg Princeton University March 26, 2012 Esteban Rossi-Hansberg () Stochastic Dynamic Programming March 26, 2012 1 / 31 References

More information

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance In this technical appendix we provide proofs for the various results stated in the manuscript

More information

Exercises in stochastic analysis

Exercises in stochastic analysis Exercises in stochastic analysis Franco Flandoli, Mario Maurelli, Dario Trevisan The exercises with a P are those which have been done totally or partially) in the previous lectures; the exercises with

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks Recap Probability, stochastic processes, Markov chains ELEC-C7210 Modeling and analysis of communication networks 1 Recap: Probability theory important distributions Discrete distributions Geometric distribution

More information

Spanning and Independence Properties of Finite Frames

Spanning and Independence Properties of Finite Frames Chapter 1 Spanning and Independence Properties of Finite Frames Peter G. Casazza and Darrin Speegle Abstract The fundamental notion of frame theory is redundancy. It is this property which makes frames

More information

Operations Research Letters. Instability of FIFO in a simple queueing system with arbitrarily low loads

Operations Research Letters. Instability of FIFO in a simple queueing system with arbitrarily low loads Operations Research Letters 37 (2009) 312 316 Contents lists available at ScienceDirect Operations Research Letters journal homepage: www.elsevier.com/locate/orl Instability of FIFO in a simple queueing

More information

Model Counting for Logical Theories

Model Counting for Logical Theories Model Counting for Logical Theories Wednesday Dmitry Chistikov Rayna Dimitrova Department of Computer Science University of Oxford, UK Max Planck Institute for Software Systems (MPI-SWS) Kaiserslautern

More information

1. Stochastic Processes and filtrations

1. Stochastic Processes and filtrations 1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S

More information

Convergence to equilibrium of Markov processes (eventually piecewise deterministic)

Convergence to equilibrium of Markov processes (eventually piecewise deterministic) Convergence to equilibrium of Markov processes (eventually piecewise deterministic) A. Guillin Université Blaise Pascal and IUF Rennes joint works with D. Bakry, F. Barthe, F. Bolley, P. Cattiaux, R. Douc,

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

STABILIZATION OF AN OVERLOADED QUEUEING NETWORK USING MEASUREMENT-BASED ADMISSION CONTROL

STABILIZATION OF AN OVERLOADED QUEUEING NETWORK USING MEASUREMENT-BASED ADMISSION CONTROL First published in Journal of Applied Probability 43(1) c 2006 Applied Probability Trust STABILIZATION OF AN OVERLOADED QUEUEING NETWORK USING MEASUREMENT-BASED ADMISSION CONTROL LASSE LESKELÄ, Helsinki

More information

Stochastic Volatility and Correction to the Heat Equation

Stochastic Volatility and Correction to the Heat Equation Stochastic Volatility and Correction to the Heat Equation Jean-Pierre Fouque, George Papanicolaou and Ronnie Sircar Abstract. From a probabilist s point of view the Twentieth Century has been a century

More information

Detailed Proof of The PerronFrobenius Theorem

Detailed Proof of The PerronFrobenius Theorem Detailed Proof of The PerronFrobenius Theorem Arseny M Shur Ural Federal University October 30, 2016 1 Introduction This famous theorem has numerous applications, but to apply it you should understand

More information

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a WHEN IS A MAP POISSON N.G.Bean, D.A.Green and P.G.Taylor Department of Applied Mathematics University of Adelaide Adelaide 55 Abstract In a recent paper, Olivier and Walrand (994) claimed that the departure

More information

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974 LIMITS FOR QUEUES AS THE WAITING ROOM GROWS by Daniel P. Heyman Ward Whitt Bell Communications Research AT&T Bell Laboratories Red Bank, NJ 07701 Murray Hill, NJ 07974 May 11, 1988 ABSTRACT We study the

More information

Convex Optimization CMU-10725

Convex Optimization CMU-10725 Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state

More information

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance In this technical appendix we provide proofs for the various results stated in the manuscript

More information

Brownian Bridge and Self-Avoiding Random Walk.

Brownian Bridge and Self-Avoiding Random Walk. Brownian Bridge and Self-Avoiding Random Walk. arxiv:math/02050v [math.pr] 9 May 2002 Yevgeniy Kovchegov Email: yevgeniy@math.stanford.edu Fax: -650-725-4066 November 2, 208 Abstract We derive the Brownian

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Prof. Tapio Elomaa tapio.elomaa@tut.fi Course Basics A new 4 credit unit course Part of Theoretical Computer Science courses at the Department of Mathematics There will be 4 hours

More information

The small ball property in Banach spaces (quantitative results)

The small ball property in Banach spaces (quantitative results) The small ball property in Banach spaces (quantitative results) Ehrhard Behrends Abstract A metric space (M, d) is said to have the small ball property (sbp) if for every ε 0 > 0 there exists a sequence

More information

Exact Simulation of Multivariate Itô Diffusions

Exact Simulation of Multivariate Itô Diffusions Exact Simulation of Multivariate Itô Diffusions Jose Blanchet Joint work with Fan Zhang Columbia and Stanford July 7, 2017 Jose Blanchet (Columbia/Stanford) Exact Simulation of Diffusions July 7, 2017

More information

NEW FRONTIERS IN APPLIED PROBABILITY

NEW FRONTIERS IN APPLIED PROBABILITY J. Appl. Prob. Spec. Vol. 48A, 209 213 (2011) Applied Probability Trust 2011 NEW FRONTIERS IN APPLIED PROBABILITY A Festschrift for SØREN ASMUSSEN Edited by P. GLYNN, T. MIKOSCH and T. ROLSKI Part 4. Simulation

More information

Geometric ρ-mixing property of the interarrival times of a stationary Markovian Arrival Process

Geometric ρ-mixing property of the interarrival times of a stationary Markovian Arrival Process Author manuscript, published in "Journal of Applied Probability 50, 2 (2013) 598-601" Geometric ρ-mixing property of the interarrival times of a stationary Markovian Arrival Process L. Hervé and J. Ledoux

More information

Importance Sampling for Rare Events

Importance Sampling for Rare Events Importance Sampling for Rare Events Soren Asmussen Department of Mathematical Sciences Aarhus University Ny Munkegade DK-8000 Aarhus C, Denmark asmus@imf.au.dk Paul Dupuis Division of Applied Mathematics

More information

The Skorokhod reflection problem for functions with discontinuities (contractive case)

The Skorokhod reflection problem for functions with discontinuities (contractive case) The Skorokhod reflection problem for functions with discontinuities (contractive case) TAKIS KONSTANTOPOULOS Univ. of Texas at Austin Revised March 1999 Abstract Basic properties of the Skorokhod reflection

More information

Convergence Rates for Renewal Sequences

Convergence Rates for Renewal Sequences Convergence Rates for Renewal Sequences M. C. Spruill School of Mathematics Georgia Institute of Technology Atlanta, Ga. USA January 2002 ABSTRACT The precise rate of geometric convergence of nonhomogeneous

More information

General Theory of Large Deviations

General Theory of Large Deviations Chapter 30 General Theory of Large Deviations A family of random variables follows the large deviations principle if the probability of the variables falling into bad sets, representing large deviations

More information

Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings. Krzysztof Burdzy University of Washington

Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings. Krzysztof Burdzy University of Washington Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings Krzysztof Burdzy University of Washington 1 Review See B and Kendall (2000) for more details. See also the unpublished

More information

1. Let X and Y be independent exponential random variables with rate α. Find the densities of the random variables X 3, X Y, min(x, Y 3 )

1. Let X and Y be independent exponential random variables with rate α. Find the densities of the random variables X 3, X Y, min(x, Y 3 ) 1 Introduction These problems are meant to be practice problems for you to see if you have understood the material reasonably well. They are neither exhaustive (e.g. Diffusions, continuous time branching

More information

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations.

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics

More information

Efficient rare-event simulation for sums of dependent random varia

Efficient rare-event simulation for sums of dependent random varia Efficient rare-event simulation for sums of dependent random variables Leonardo Rojas-Nandayapa joint work with José Blanchet February 13, 2012 MCQMC UNSW, Sydney, Australia Contents Introduction 1 Introduction

More information

Gärtner-Ellis Theorem and applications.

Gärtner-Ellis Theorem and applications. Gärtner-Ellis Theorem and applications. Elena Kosygina July 25, 208 In this lecture we turn to the non-i.i.d. case and discuss Gärtner-Ellis theorem. As an application, we study Curie-Weiss model with

More information

Bayesian Regularization

Bayesian Regularization Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors

More information

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about 0368.4170: Cryptography and Game Theory Ran Canetti and Alon Rosen Lecture 7 02 December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about Two-Player zero-sum games (min-max theorem) Mixed

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle  holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Lecture 4 Noisy Channel Coding

Lecture 4 Noisy Channel Coding Lecture 4 Noisy Channel Coding I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw October 9, 2015 1 / 56 I-Hsiang Wang IT Lecture 4 The Channel Coding Problem

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal

More information