On Lévy measures for infinitely divisible natural exponential families

Célestin C. Kokonendji (a,*), Mohamed Khoudar (b)

(a) University of Pau - LMA & IUT STID, Pau, France
(b) University of Pau - IUT STID, Pau, France

Abstract

It has appeared that infinitely divisible distributions are increasingly defined in terms of their Lévy measures. Let $\mu$ be an infinitely divisible positive measure on $\mathbb{R}$ (not necessarily a probability). In terms of variance functions, we give a simple characterization of the natural exponential family $G$ generated by a modified Lévy measure $\rho = \rho(\mu)$ from the family $F$ generated by $\mu$. This connection shows, in particular, that if $F$ admits a polynomial variance function of degree $p$ ($p \geq 2$), then the variance function of $G$ always has a quadratic form. For $\alpha$-stable processes with $\alpha < 1$, or $p$-power variance functions with $p > 1$, the corresponding measures $\rho$ are always gamma distributions. Some other examples, related to the compound Poisson processes of Hinde-Demétrio and leading to $\rho$ in the negative binomial families, are given as new characterizations of these processes.

Key words: Compound Poisson process, gamma distribution, Lévy process, negative binomial distribution, stable process, variance function.

1991 MSC: Primary 60E07; Secondary 60G51, 62E10.

Abbreviated title: On Lévy measures.

(*) Address for correspondence: C.C. Kokonendji. Université de Pau et des Pays de l'Adour. Laboratoire de Mathématiques Appliquées - CNRS UMR 5142. Département STID. Avenue de l'Université. 64000 Pau, France. Phone: +33 559 407 145. Fax: +33 559 407 140. Email address: celestin.kokonendji@univ-pau.fr (Célestin C. Kokonendji).

Preprint submitted to LMA: Technical Report No. 0504, 25 February 2005

1 Introduction

Infinitely divisible probability measures, defined as those admitting $n$th roots in the convolution sense for every positive integer $n$, are intimately connected to Lévy processes (e.g. Bertoin, 1996; Sato, 1999). A Lévy process associated with the infinitely divisible law $\mu$ is simply a stochastic process $X = X^{\mu} = \{X_t;\ t \geq 0\}$ with stationary and independent increments (in other words, $X_{s+t} - X_s$ is independent of $\{X_r : r \leq s\}$ and has the same distribution as $X_t$) and $X_0 = 0$. Throughout this paper we adopt the convention that all Lévy processes are càdlàg; that is, their sample paths are right continuous with limits from the left.

In order to study the behaviour of the Lévy process $X = X^{\mu}$ generated by $\mu$, we can use the well-known Lévy-Khintchine characterization: a probability measure $\mu$ on $\mathbb{R}$ is infinitely divisible if and only if there exist $\gamma, \sigma \in \mathbb{R}$ and a positive $\sigma$-finite measure $\nu$ on $\mathbb{R} \setminus \{0\}$ satisfying

$$\int_{\mathbb{R} \setminus \{0\}} \min(1, x^2)\,\nu(dx) < \infty \tag{1}$$

such that the characteristic function of $\mu$ has the form

$$\int_{\mathbb{R}} e^{i\theta x}\,\mu(dx) = \exp\left\{ i\gamma\theta - \frac{\sigma^2\theta^2}{2} + \int_{\mathbb{R} \setminus \{0\}} \left[ e^{i\theta x} - 1 - i\theta\,\tau(x) \right] \nu(dx) \right\}, \tag{2}$$

where $\tau$ is some fixed bounded continuous function on $\mathbb{R}$ such that $(\tau(x) - x)/x^2$ is bounded as $x \to 0$ (e.g. $\tau(x) = x/(1 + x^2)$ and $\tau(x) = \sin x$). In this case, the triple $(\gamma, \sigma^2, \nu)$ is unique, and the measure $\nu = \nu(\mu)$ satisfying (1) is called the Lévy measure of $\mu$. From (2), the first characteristic $\gamma$ is connected to the drift of the process $X = X^{\mu}$, whereas $\sigma^2$ is the infinitesimal variance of the Brownian motion part of $X$, and $\nu$ determines the probabilistic character of the jumps of $X$. For example, a compound Poisson process is a Lévy process $X$ with $\gamma = \sigma = 0$ and $\nu$ a finite measure. When $\nu$ is not a finite measure, $X$ is not a compound Poisson process and, hence, $X$ has infinitely many jumps in every finite time interval of strictly positive length. Hence, it appears that Lévy processes and, also, infinitely divisible distributions are increasingly defined in terms of their Lévy measures.

From (1) we have the following classification of the Lévy measure $\nu = \nu(\mu)$ in three types.
If $\nu$ is bounded then the Lévy measure (or the associated Lévy process) is said to be of type 0. If $\nu$ is unbounded but such that $\int_{\mathbb{R} \setminus \{0\}} \min(1, |x|)\,\nu(dx) < \infty$, the Lévy measure is said to be of type 1. It is said to be of type 2 if $\int_{\mathbb{R} \setminus \{0\}} \min(1, |x|)\,\nu(dx)$ diverges. The (centering) function $\tau$ is not useful for types 0 or 1, and we can therefore take $\tau(x) = 0$ in (2).

When these infinitely divisible distributions belong to natural exponential families, which are characterized by their variance functions, this paper proposes a simple expression of the variance function for a modified Lévy measure $\rho = \rho(\nu) = \rho(\mu)$ from that of $\mu$. The new formulation can make the classification of the Lévy measure given in (1) and (2) easier. Section 2 recalls some basic properties of natural exponential families and their variance functions. Section 3 presents the main result with some useful consequences. Sections 4 and 5 are respectively devoted to the classical situation of stable processes and to a new characterization of the Hinde-Demétrio processes (Kokonendji et al., 2004), which are particular cases of compound Poisson processes.

2 Natural exponential family (NEF)

It is well known that (natural) exponential families of probability measures represent a very important class of distributions both in probability and in statistical theory (e.g. Küchler and Sørensen, 1997). Here, we briefly recall some notation and elementary properties concerning natural exponential families on the real line and their variance functions (e.g. Kotz et al., 2000, Chapter 54).

Let $\mathcal{M}$ denote the set of positive measures $\mu$ on $\mathbb{R}$ (possibly unbounded) not concentrated at one point, with the Laplace transform of $\mu$ given by

$$L_\mu(\theta) = \int_{\mathbb{R}} \exp\{\theta x\}\,\mu(dx)$$

and such that the interior $\Theta(\mu)$ of the interval $\{\theta \in \mathbb{R} : L_\mu(\theta) < \infty\}$ is non-void. Note that $\mu \in \mathcal{M}$ is not necessarily a probability. We denote by $K_\mu(\theta) = \ln L_\mu(\theta)$ the cumulant function of $\mu$. For $\mu \in \mathcal{M}$, the set $F = F(\mu)$ of probabilities

$$P(\theta; \mu)(dx) = \exp\{\theta x - K_\mu(\theta)\}\,\mu(dx),$$

when $\theta$ varies in $\Theta(\mu)$, is called the natural exponential family (NEF) generated by $\mu$. The measure $\mu$ is said to be a basis of $F$. Note that $F(\mu) = F(\mu')$ if and only if there exists $(a, b) \in \mathbb{R}^2$ such that $\mu'(dx) = \exp\{ax + b\}\,\mu(dx)$; hence, one says that $\mu$ and $\mu'$ are equivalent.

The function $K_\mu$ is strictly convex and real analytic on $\Theta(\mu)$. Since

$$K'_\mu(\theta) = \int_{\mathbb{R}} x\,P(\theta; \mu)(dx),$$

the set $M_F = K'_\mu[\Theta(\mu)]$ is called the mean domain of $F$ and depends only on $F$ and not on the particular choice of $\mu$. For each $m \in M_F$, $\theta_\mu(m)$ is the unique element of $\Theta(\mu)$ such that $K'_\mu(\theta_\mu(m)) = m$. The map $m \mapsto$
$P(\theta_\mu(m); \mu) = P(m; F)$ being a bijection between the sets $M_F$ and $F$, we can parametrize $F$ by the mean $m$ and obtain, for each $m \in M_F$, the variance function of $P(m; F)$ as

$$V_F(m) = \int_{\mathbb{R}} (x - m)^2\,P(m; F)(dx) = K''_\mu(\theta_\mu(m)).$$

Together with the mean domain $M_F$, the variance function $V_F$ characterizes the family $F$ within the class of all NEFs. Moreover, for many common distributions on $\mathbb{R}$, $V_F$ has a simpler expression than the density of $P(m; F)$ (e.g. Morris, 1982; Letac and Mora, 1990; Bar-Lev et al., 1992). The simplest variance functions were investigated by Morris (1982), who points out that the only variance functions which are restrictions to an interval of some quadratic polynomial belong to one of the following six families (up to an affine transformation and up to a power of convolution):

(i) normal: $M_F = \mathbb{R}$ and $V_F(m) = 1$;
(ii) Poisson: $M_F = (0, \infty)$ and $V_F(m) = m$;
(iii) binomial: $M_F = (0, 1)$ and $V_F(m) = m(1 - m)$;
(iv) negative binomial: $M_F = (0, \infty)$ and $V_F(m) = m(1 + m)$;
(v) gamma: $M_F = (0, \infty)$ and $V_F(m) = m^2$;
(vi) hyperbolic: $M_F = \mathbb{R}$ and $V_F(m) = 1 + m^2$.

To conclude this section, we first recall two elementary operations on NEFs as given in Letac and Mora (1990), and we then state a general transformation on variance functions as shown in Kokonendji and Seshadri (1994, Theorem 2.1).

a) Affine transformation: Let $F$ be a NEF and $\varphi : x \mapsto a_1 x + a_0$ with $(a_1, a_0) \in \mathbb{R}^2$ and $a_1 \neq 0$. Then the image $\varphi(F)$ of the elements of $F$ under $\varphi$ is a NEF such that $M_{\varphi(F)} = \varphi(M_F)$ and $V_{\varphi(F)}(m) = a_1^2\,V_F((m - a_0)/a_1)$.

b) Power of convolution: Let $\mu_1 \in \mathcal{M}$ and $F_1 = F(\mu_1)$. Denote by $\Lambda_{\mu_1}$ (or $\Lambda_{F_1}$) the set of reals $t > 0$ such that there exists a measure $\mu_t$ (also denoted by $\mu_1^{*t}$) in $\mathcal{M}$ with $\Theta(\mu_t) = \Theta(\mu_1)$ and $K_{\mu_t}(\theta) = t\,K_{\mu_1}(\theta)$. Then, for $t \in \Lambda_{F_1}$, the NEF $F_t = F(\mu_t)$ generated by $\mu_t$ is such that $M_{F_t} = t\,M_{F_1}$ and $V_{F_t}(m) = t\,V_{F_1}(m/t)$. Note that $\Lambda_{\mu_1} \cup \{0\}$ is an additive closed semigroup of $[0, \infty)$ and that the convolution $\mu_{t_1} * \mu_{t_2}$ equals $\mu_{t_1 + t_2}$. Furthermore, $\{1, 2, \ldots\} \subseteq \Lambda_{\mu_1}$. In particular, if $\Lambda_{\mu_1} = (0, \infty)$, then $\mu_1$ (or $F_1$) is said to be infinitely divisible. For example, except for the binomial type, all other quadratic NEFs are infinitely divisible.
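
The identity between the variance function and the second derivative of the cumulant function, together with the power-of-convolution rule in b), can be checked numerically. The following is only a minimal sketch: the three cumulant functions are the standard ones for the Poisson, gamma and negative binomial bases (an assumption of this example, not taken from the text), the derivatives are approximated by central finite differences, and all function names are hypothetical.

```python
import math


def first_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


# Assumed bases: name -> (cumulant function K on theta < 0, variance function V).
FAMILIES = {
    "Poisson": (lambda th: math.exp(th), lambda m: m),
    "gamma": (lambda th: -math.log(-th), lambda m: m * m),
    "negative binomial": (
        lambda th: -math.log(1.0 - math.exp(th)),
        lambda m: m * (1.0 + m),
    ),
}


def compare(name, theta=-0.7, t=1.0):
    """Return (mean, numerical K_t'', closed form t * V(m / t)) for K_t = t * K."""
    K, V = FAMILIES[name]

    def K_t(th):
        return t * K(th)

    m = first_derivative(K_t, theta)         # mean of the tilted measure
    numeric = second_derivative(K_t, theta)  # variance computed from the cumulant
    closed_form = t * V(m / t)               # rule b): V_{F_t}(m) = t V_{F_1}(m / t)
    return m, numeric, closed_form


if __name__ == "__main__":
    for family in FAMILIES:
        for t in (1.0, 2.5):
            print(family, t, compare(family, t=t))
```

The two last entries of each printed triple should agree up to the finite-difference error, for $t = 1$ (plain variance function) as well as for $t \neq 1$ (power of convolution).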
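
Proposition 1 Let $\mathcal{M}_1 \subset \mathcal{M}$, and let $T$ be a map from $\mathcal{M}_1$ to $\mathcal{M}$ such that, for all $\mu \in \mathcal{M}_1$, $\Theta(T(\mu)) = \Theta(\mu)$. If we define the NEFs $F = F(\mu)$ and $\bar{F} = F(T(\mu))$, and $\bar{m} = K'_{T(\mu)}(\theta_\mu(m))$, then we have $V_{\bar{F}}(\bar{m}) = V_F(m)\,(d\bar{m}/dm)$.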
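
As a sanity check, Proposition 1 can be applied to the power of convolution recalled in b). The sketch below assumes the notation $T$ for the map on bases and $\bar{m}$ for the mean of the transformed family (symbols chosen here for illustration), takes $T(\mu) = \mu_t$ with $K_{\mu_t} = t K_\mu$, and recovers the rule $V_{F_t}(m) = t\,V_F(m/t)$.

```latex
% Proposition 1 with T(\mu) = \mu_t, where K_{\mu_t}(\theta) = t K_\mu(\theta)
% and \Theta(\mu_t) = \Theta(\mu).
\begin{align*}
  \bar{m} &= K'_{\mu_t}(\theta_\mu(m)) = t\,K'_\mu(\theta_\mu(m)) = t\,m,
  \qquad \frac{d\bar{m}}{dm} = t, \\
  V_{F_t}(\bar{m}) &= V_F(m)\,\frac{d\bar{m}}{dm} = t\,V_F(m)
  = t\,V_F\!\left(\frac{\bar{m}}{t}\right).
\end{align*}
```

The proposition itself is just the chain rule: since $dm/d\theta = K''_\mu(\theta) = V_F(m)$, one has $K''_{T(\mu)}(\theta) = d\bar{m}/d\theta = (d\bar{m}/dm)\,V_F(m)$.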

3 Lévy measures of NEFs

The infinitely divisible measures $\mu \in \mathcal{M}$ (those with $\Lambda_\mu = (0, \infty)$, where $\Lambda_\mu$ is the set of admissible powers of convolution of Section 2) are remarkably characterized by Letac (1992, page 12) through the following result; see also Seshadri (1993, Theorem 5.3).

Proposition 2 Let $\mu$ be in $\mathcal{M}$. Then $\mu$ is infinitely divisible if and only if there exists $\rho = \rho(\mu)$ in $\mathcal{M}$ (or concentrated at one point) such that, for all $\theta$ in $\Theta(\mu)$,

$$K''_\mu(\theta) = \int_{\mathbb{R}} \exp\{\theta x\}\,\rho(dx). \tag{3}$$

In this case $\Theta(\mu) = \Theta(\rho)$ and, if $\theta_0 \in \Theta(\mu)$, the Lévy measure $\nu_{\theta_0}$ corresponding to $P(\theta_0; \mu)$ is

$$\nu_{\theta_0}(dx) = x^{-2} \exp\{\theta_0 x\}\,[\rho(dx) - \rho(\{0\})\,\delta_0(dx)], \tag{4}$$

where $\delta_0$ denotes the Dirac mass at 0.

We here call the measure $\rho = \rho(\mu)$, defined by (3), the modified Lévy measure of $\mu$. The correspondence between $\rho = \rho(\mu)$ and the Lévy measure $\nu = \nu(\mu)$ is given by (4) for $P(\theta_0; \mu)$. Note that if $\mu$ is a probability measure then $\theta_0$ can be taken equal to 0 in (4); in general, we are dealing with the exponential tilting of $\mu$ to get $P(\theta_0; \mu)$. The Esscher transform is simply a tilting, but at the level of (Lévy) processes. Now, we can carry on with the previous classification of infinitely divisible measures in the context of NEFs.

Theorem 3 Let $F$ be an infinitely divisible NEF generated by $\mu \in \mathcal{M}$ with variance function $V_F$ on $M_F$. Assume that $\rho = \rho(\mu)$ defined by (3) is also in $\mathcal{M}$ and generates the NEF $G$. Then the variance function $V_G$ of $G$ is such that, for all $m$ in $M_F$,

$$V_G(V'_F(m)) = V_F(m)\,V''_F(m). \tag{5}$$

Moreover, if $V_F(m) \sim m^p$ for $p \geq 2$ as $m$ tends to infinity, then $V_G(\bar{m}) \sim \bar{m}^2$ as $\bar{m} = V'_F(m)$ tends to infinity.

As observed by Bar-Lev (1987) and Bar-Lev et al. (1992), polynomial variance functions are the most important ones; we therefore only state the following consequence for the behaviour of certain Lévy processes.

Corollary 4 With the assumptions of Theorem 3, if $V_F(m)$ is a polynomial in $m$ of degree $p \geq 2$, then $V_G(\bar{m})$ is a polynomial in $\bar{m}$ of degree 2.

Proof (of Theorem 3): From (3), the cumulant function of $\rho = \rho(\mu)$ is written as

$$K_\rho(\theta) = \ln K''_\mu(\theta) = \ln V_F(m),$$

for all $\theta \in \Theta(\mu) = \Theta(\rho)$ and $m = K'_\mu(\theta) \in M_F$. Since $dm/d\theta = K''_\mu(\theta) = V_F(m)$, the mean of $G$ is

$$\bar{m} = K'_\rho(\theta) = \frac{dm}{d\theta}\,\frac{d[\ln V_F(m)]}{dm} = V'_F(m)$$

and therefore $d\bar{m}/dm = V''_F(m)$; formula (5) is then easily deduced from Proposition 1. Since a variance function is positive and real analytic on its domain, the assumption $V_F(m) \sim m^p$ for $p \geq 2$ as $m \to \infty$ implies that, for $F$ infinitely divisible, we have $M_F \supseteq (r, \infty)$ with $r > 0$. Hence, we obtain the desired result by (5), because we obviously have $\bar{m} = V'_F(m) \sim m^{p-1}$ and $V''_F(m) \sim m^{p-2}$ as $m \to \infty$.

4 Stable processes

We are interested here in the positive $\alpha$-stable processes $X^{\alpha} = \{X_{\alpha,t};\ t \geq 0\}$, which are Lévy processes (of type 1) generated by the probability measures

$$\mu_{\alpha,t}(dx) = \frac{dx}{\pi x} \sum_{k=1}^{\infty} \frac{t^k\,(\alpha - 1)^k\,\Gamma(1 + \alpha k)}{k!\,\alpha^k\,[(1 - \alpha)x]^{\alpha k}}\,\sin(-k\pi\alpha), \qquad x > 0,$$

where $0 < \alpha < 1$ and $t > 0$; see Feller (1971) for basic properties. Note that for $\alpha \in [1, 2]$ one defines a family of (extreme) stable distributions concentrated on $\mathbb{R}$, whose special cases are the Gaussian ($\alpha = 2$) and Cauchy ($\alpha = 1$) distributions. However, for $\alpha = 0$ we have the gamma distributions (of type 1), and for all $\alpha < 0$ we have distributions that can be seen as a Poisson sum of gamma distributions (of type 2). Instead of the stability index $\alpha$, it is convenient to introduce the power parameter $p$, defined by

$$(p - 1)(1 - \alpha) = 1,$$

which clearly means $p > 2$ for $0 < \alpha < 1$. The well-known special case of positive stable families is $\alpha = 1/2$ or $p = 3$, for which

$$\mu_{1/2,t}(dx) = \frac{t}{\sqrt{2\pi x^3}} \exp\{-t^2/(2x)\}\,dx, \qquad x > 0,$$

belongs to the family of inverse Gaussian distributions, and $X^{1/2}$ is called an inverse Gaussian process (e.g. Seshadri, 1993). For a complete description of the stable (also called Tweedie or power) NEFs or processes for general values of $\alpha$ or $p$, we refer to Jørgensen (1997, Chapter 4).

For fixed $p > 2$ (or $0 < \alpha < 1$) and $t > 0$, the infinitely divisible NEF $F_{p,t} = F(\mu_{\alpha,t})$ generated by $\mu_{\alpha,t}$ is such that $\Theta(\mu_{\alpha,t}) = (-\infty, 0]$,

$$K_{\mu_{\alpha,t}}(\theta) = \frac{t(\alpha - 1)}{\alpha} \left[ \frac{\theta}{\alpha - 1} \right]^{\alpha},$$

and $V_{F_{p,t}}(m) = m^p\,t^{1-p}$ on $M_{F_{p,t}} = (0, \infty)$. Thus, by Theorem 3, letting $\bar{m}_t = p\,m^{p-1}\,t^{1-p} = V'_{F_{p,t}}(m)$, we easily have

$$V_{G_{p,t}}(\bar{m}_t) = \frac{p - 1}{p}\,\bar{m}_t^{\,2} \quad \text{on } M_{G_{p,t}} = (0, \infty), \tag{6}$$

where $G_{p,t} = F(\rho_{\alpha,t})$ is the NEF generated by the modified Lévy measure $\rho_{\alpha,t} = \rho(\mu_{\alpha,t})$ and is of the gamma family for any $p > 2$ (or $0 < \alpha < 1$). This is a classical result, known in different ways and essentially through the density or the cumulant functions (e.g. Küchler and Sørensen, 1997, Example 2.1.4; Letac, 1992, pages 13-15; or Seshadri, 1993, Example 5.10). Note that the characterization by variance function (6) can easily be extended to all $p > 1$ (or $\alpha < 1$).

5 Hinde-Demétrio processes

The family of Hinde-Demétrio distributions has recently been introduced by Kokonendji et al. (2004); see also Kokonendji and Malouche (2004). It is known that these distributions do not generally have probability mass functions that can be written in closed form. They are nevertheless useful models and admit a simple form of variance function. Let $p > 1$. Consider the NEF $F_p = F(\mu_p)$ generated by $\mu_p \in \mathcal{M}$ such that its cumulant function is given by

$$K_{\mu_p}(\theta) = e^{\theta}\,{}_2F_1\!\left( \frac{1}{p-1}, \frac{1}{p-1}; \frac{p}{p-1}; e^{(p-1)\theta} \right), \qquad \theta < 0, \tag{7}$$

where

$${}_2F_1(a, b; c; z) = 1 + \frac{ab}{c}\,\frac{z}{1!} + \frac{a(a+1)\,b(b+1)}{c(c+1)}\,\frac{z^2}{2!} + \cdots$$

is the Gaussian hypergeometric function (e.g. Johnson et al., 1992, pages 17-19). The Hinde-Demétrio family $F_p$ is infinitely divisible and is concentrated on the additive semigroup $\mathbb{N} + p\mathbb{N}$, for all $p > 1$. Its variance function is given by $V_{F_p}(m) = m + m^p$ on $M_{F_p} = (0, \infty)$. As particular cases, we have the negative binomial for $p = 2$ and the strict arcsine for $p = 3$ (Kokonendji and Khoudar, 2004; Letac and Mora, 1990). Now, by Theorem 3, putting $\bar{m} = 1 + p\,m^{p-1} = V'_{F_p}(m)$, we get

$$V_{G_p}(\bar{m}) = \frac{p - 1}{p}\,(\bar{m} - 1)(\bar{m} - 1 + p) \quad \text{on } M_{G_p} = (1, \infty), \tag{8}$$

where $G_p = F(\rho_p)$ is the NEF generated by the modified Lévy measure $\rho_p = \rho(\mu_p)$ and is of the negative binomial family for any $p > 1$. Note that the associated Lévy measures are of type 0.
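
The closed forms (6) and (8) both follow from formula (5) by elementary algebra, and this can be cross-checked numerically. The sketch below is illustrative only: it encodes the two variance functions $m^p t^{1-p}$ and $m + m^p$ with their exact first and second derivatives, evaluates the right-hand side of (5), and compares it with the gamma form (6) and the negative binomial form (8). All function names are hypothetical, and the grid of values of $p$, $m$ and $t$ is an arbitrary choice.

```python
def power_variance(p, t=1.0):
    """Return (V, V', V'') for the power variance function V(m) = m**p * t**(1 - p)."""
    scale = t ** (1.0 - p)
    return (
        lambda m: scale * m ** p,
        lambda m: scale * p * m ** (p - 1.0),
        lambda m: scale * p * (p - 1.0) * m ** (p - 2.0),
    )


def hinde_demetrio_variance(p):
    """Return (V, V', V'') for the Hinde-Demetrio variance function V(m) = m + m**p."""
    return (
        lambda m: m + m ** p,
        lambda m: 1.0 + p * m ** (p - 1.0),
        lambda m: p * (p - 1.0) * m ** (p - 2.0),
    )


def modified_levy_variance(variance_triple, m):
    """Formula (5): return (m_bar, V_G(m_bar)) with m_bar = V'(m), V_G = V(m) * V''(m)."""
    V, dV, d2V = variance_triple
    return dV(m), V(m) * d2V(m)


def gamma_form(p, m_bar):
    """Right-hand side of (6)."""
    return (p - 1.0) / p * m_bar ** 2


def negative_binomial_form(p, m_bar):
    """Right-hand side of (8)."""
    return (p - 1.0) / p * (m_bar - 1.0) * (m_bar - 1.0 + p)


def max_relative_gap(ps=(1.5, 2.0, 3.0, 4.5), ms=(0.3, 1.0, 7.0), t=2.0):
    """Largest relative difference between (5) and the closed forms (6) and (8)."""
    gap = 0.0
    for p in ps:
        for m in ms:
            m_bar, v = modified_levy_variance(power_variance(p, t), m)
            gap = max(gap, abs(v - gamma_form(p, m_bar)) / v)
            m_bar, v = modified_levy_variance(hinde_demetrio_variance(p), m)
            gap = max(gap, abs(v - negative_binomial_form(p, m_bar)) / v)
    return gap


if __name__ == "__main__":
    print(max_relative_gap())
```

The printed gap should be at the level of floating-point rounding. The grid deliberately includes $p = 1.5$, in line with the remark that (6) extends to all $p > 1$.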
To define the Hinde-Demétrio processes associated to the NEF $F_p$ for fixed $p > 1$, we first introduce the discrete random variable $\xi_p$ taking its values in $\xi_p(\Omega) = 1 + (p - 1)\mathbb{N} = \{1, p, 2p - 1, 3p - 2, \ldots\}$ and such that its probability generating function is

$$E(z^{\xi_p}) = \frac{z\,{}_2F_1\!\left( \frac{1}{p-1}, \frac{1}{p-1}; \frac{p}{p-1}; (qz)^{p-1} \right)}{{}_2F_1\!\left( \frac{1}{p-1}, \frac{1}{p-1}; \frac{p}{p-1}; q^{p-1} \right)}, \tag{9}$$

where $0 < q < 1$ is a reparametrization of $\theta$ given in (7). Thus, the Hinde-Demétrio processes $X_p = \{X_{p,t};\ t \geq 0\}$ are compound Poisson processes

$$X_{p,t} = \sum_{i=1}^{N_t} \xi_{p,i} = \xi_{p,1} + \cdots + \xi_{p,N_t},$$

where $N = \{N_t;\ t \geq 0\}$ is a standard Poisson process with intensity $\lambda > 0$, independent of the random variables $\xi_{p,i}$, $i = 1, 2, \ldots$ (the sizes of jumps), which are independent and identically distributed as $\xi_p$ defined by (9). Recall that the number $N_t$ of jumps in the time interval $(0, t]$ ($N_0 = 0$) is Poisson with mean $\lambda t$. Hence, the distribution of $X_{p,t}$ belongs to the NEF $F_{p,t} = F(\mu_p^{*t})$ generated by $\mu_p^{*t}$ (i.e. the $t$th power of convolution of $\mu_p$). Besides, the compound Poisson process is completely characterized by knowledge of the (modified) Lévy measure; in particular, (8) provides a new characterization of Hinde-Demétrio processes.

Note finally that, for the particular cases $p = 2, 3, \ldots$, the Hinde-Demétrio processes $X_p$ can be used for modelling (overdispersed) count data, for example the number of claims reported to an insurance company during a period of time (e.g. see Kokonendji and Malouche, 2004, for some references).

References

Bar-Lev, S.K. (1987), Discussion on paper by B. Jørgensen "Exponential dispersion models", J. Roy. Statist. Soc., Series B 49, 153-154.
Bar-Lev, S.K., Bshouty, D. and Enis, P. (1992), On polynomial variance functions, Probab. Theory Relat. Fields 94, 69-82.
Bertoin, J. (1996), Lévy Processes (Cambridge University Press, Cambridge).
Feller, W. (1971), An Introduction to Probability Theory and its Applications, Vol. II (2nd ed., Wiley, New York).
Johnson, N.L., Kotz, S. and Kemp, A.W. (1992), Univariate Discrete Distributions (2nd ed., John Wiley & Sons, New York).
Jørgensen, B. (1997), The Theory of Dispersion Models (Chapman & Hall, London).
Kokonendji, C.C., Demétrio, C.B.G. and Dossou-Gbété, S. (2004), Some discrete exponential dispersion models: Poisson-Tweedie and Hinde-Demétrio classes, SORT (Statistics and Operations Research Transactions) 28 (2), 201-214.
Kokonendji, C.C. and Khoudar, M. (2004), On strict arcsine distribution, Commun. Statist.-Theory Meth. 33, 993-1006.
Kokonendji, C.C. and Malouche, D. (2004), Selecting test of distribution in the Hinde-Demétrio family, Preprint LMA - Pau No. 0421 (submitted for publication to Ann. Statist.).
Kokonendji, C.C. and Seshadri, V. (1994), The Lindsay transform of natural exponential families, Canadian J. Statist. 22 (2), 259-272.
Kotz, S., Balakrishnan, N. and Johnson, N.L. (2000), Continuous Multivariate Distributions, Vol. 1: Models and Applications (2nd ed., John Wiley & Sons, New York).
Küchler, U. and Sørensen, M. (1997), Exponential Families of Stochastic Processes (Springer, New York).
Letac, G. (1992), Lectures on Natural Exponential Families and their Variance Functions (IMPA, Rio de Janeiro).
Letac, G. and Mora, M. (1990), Natural real exponential families with cubic variance functions, Ann. Statist. 18, 1-37.
Morris, C.N. (1982), Natural exponential families with quadratic variance functions, Ann. Statist. 10, 65-80.
Sato, K. (1999), Lévy Processes and Infinitely Divisible Distributions (Cambridge University Press, Cambridge).
Seshadri, V. (1993), The Inverse Gaussian Distribution: a Case Study in Exponential Families (Oxford University Press, New York).