Product Autoregressive Models with Log-Laplace and Double Pareto Log-normal Marginal Distributions


CHAPTER 6

Product Autoregressive Models with Log-Laplace and Double Pareto Log-normal Marginal Distributions

6.1 Introduction

The relationships between the Laplace and log-Laplace distributions and between the double Pareto lognormal and normal Laplace distributions are analogous to the relationship between the normal and lognormal distributions. These models have appeared in the statistical, economic and scientific literature over the past seventy years. Most often they appeared as models for data sets with particular properties, or were derived as the most natural models based on the properties of the processes studied. Kozubowski and Podgórski (2003) review many uses of the log-Laplace distribution. (Some results included in this chapter form part of the paper Jose and Manu (2012).) The asymmetric log-Laplace distribution

has been fitted to pharmacokinetic and particle size data. Particle size studies often show the log size to follow a tent-shaped distribution like the Laplace; see Julià and Vives-Rego (2005) for more details. It has been used to model growth rates, stock prices, annual gross domestic product, and interest and foreign exchange rates. One explanation suggested for the goodness of fit of the log-Laplace is its relationship to Brownian motion stopped at a random exponential time. In this chapter we consider log-Laplace distributions, double Pareto lognormal distributions and their multivariate extensions, along with applications in time series modelling using product autoregression. Various divisibility properties, such as infinite divisibility and geometric infinite divisibility, are studied. Multiplicative infinite divisibility and geometric multiplicative infinite divisibility are introduced and studied. Additive autoregressive models with these marginal distributions are also developed. The generation of the process, sample path properties and estimation of parameters are considered.

6.2 The log-Laplace distribution and its properties

Symmetric and asymmetric forms of the log-Laplace distribution have been used for modelling various phenomena by a number of researchers. Inoue (1978) derived the symmetric log-Laplace distribution from his stochastic model for income distribution, fitted it to income data by maximum likelihood and reported a better fit than that of the lognormal model traditionally used in this area. Uppuluri (1981) obtained an axiomatic characterization of this distribution and derived it from a set of properties of the dose-response curve for radiation carcinogenesis. Barndorff-Nielsen (1977) and Bagnold and Barndorff-Nielsen (1980) proposed the log-hyperbolic models, of which the log-Laplace is a limiting case, for particle size data.
Log-Laplace models have recently been proposed for growth rates of diverse processes such as annual gross domestic product, stock prices, interest or foreign currency exchange rates, company sizes, and other processes. Log-Laplace distributions are mixtures of lognormal distributions and have asymptotically linear tails. These two features make them particularly suitable for modelling size data.
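The Brownian-motion connection mentioned above can be seen in a small simulation (an illustrative sketch of ours, not from the chapter; function names are ours): a Brownian motion with drift, observed at an independent exponential time, has an asymmetric Laplace law, so its exponential is log-Laplace.

```python
import math
import random
import statistics

def bm_at_exp_time(theta, drift, s, rho, rng):
    """Brownian motion started at theta, observed at a random time T ~ Exp(rho).
    Across the randomness of T the result follows an asymmetric Laplace law."""
    t = rng.expovariate(rho)
    return theta + drift * t + s * math.sqrt(t) * rng.gauss(0.0, 1.0)

rng = random.Random(0)
# Zero drift gives a symmetric Laplace log, so Y = e^X has median e^theta = 1 here.
ys = [math.exp(bm_at_exp_time(0.0, 0.0, 1.0, 1.0, rng)) for _ in range(100_000)]
```

With 100 000 draws the sample median should sit very close to e^θ = 1.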

Figure 6.1: (a) β = 1.3, δ = 1; (b) β = 1, δ = 1.

Definition 6.2.1. A random variable Y is said to have a log-Laplace distribution with parameters δ > 0, α > 0 and β > 0 (LL(δ, α, β)) if its probability density function (p.d.f.) is

g(y) = \frac{1}{\delta}\,\frac{\alpha\beta}{\alpha+\beta}\begin{cases}(y/\delta)^{\beta-1}, & 0 < y < \delta,\\ (\delta/y)^{\alpha+1}, & y \ge \delta.\end{cases}   (6.2.1)

The cumulative distribution function has the form

G(y) = \begin{cases}0, & y \le 0,\\ \frac{\alpha}{\alpha+\beta}\,(y/\delta)^{\beta}, & 0 < y \le \delta,\\ 1 - \frac{\beta}{\alpha+\beta}\,(\delta/y)^{\alpha}, & y > \delta.\end{cases}   (6.2.2)

This distribution can be derived by combining two power laws and has power tails at zero and at infinity. The density has a distinct tent shape when plotted on the log-log scale. Graphs of the log-Laplace density for fixed β and various values of α are given in Figure 6.1. The log-Laplace p.d.f. (6.2.1) can be derived as the distribution of e^X, where X is an

asymmetric Laplace (AL) variable with density

f(x) = \frac{\alpha\beta}{\alpha+\beta}\begin{cases}\exp(-\alpha(x-\theta)), & x \ge \theta,\\ \exp(\beta(x-\theta)), & x < \theta.\end{cases}   (6.2.3)

Therefore if X has the AL distribution (6.2.3), then the density of Y = e^X is given by (6.2.1) with δ = e^θ. Kozubowski and Podgórski (2003) studied some important properties of LL(δ, α, β). It has Pareto-type tails at zero and infinity; that is, P(Y > x) ∼ C_1 x^{-α} as x → ∞ and P(0 < Y ≤ x) ∼ C_2 x^{β} as x → 0+. It also possesses an invariance property with respect to scaling and exponentiation, which is a natural property of variables describing multiplicative processes such as growth. The distribution has a representation as an exponential growth-decay process over a random exponential time, which extends a similar property of the Pareto distribution by allowing decay in addition to growth. Its simplicity allows for efficient practical application and thus gives it an advantage over many other models with heavy power tails. The upper tail index is not bounded from above, which adds flexibility over some other models for heavy-tailed data, such as stable or geometric stable laws, where its value is limited by two. The maximum entropy property of the LL distribution is desirable in many applications, as is its stability with respect to geometric multiplication, which may play a fundamental role in modelling growth rates. The limiting distribution of geometric products of LL random variables leads to useful approximations, and its straightforward extension to the multivariate setting allows modelling of correlated multivariate rate data, such as joint returns on portfolios of securities.
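The density and distribution function admit a quick numerical check (an illustrative sketch of ours; the function names are not from the chapter):

```python
def ll_pdf(y, delta, alpha, beta):
    """Log-Laplace density g(y) of LL(delta, alpha, beta)."""
    if y <= 0:
        return 0.0
    c = alpha * beta / ((alpha + beta) * delta)
    return c * (y / delta) ** (beta - 1) if y < delta else c * (delta / y) ** (alpha + 1)

def ll_cdf(y, delta, alpha, beta):
    """Log-Laplace distribution function G(y) of LL(delta, alpha, beta)."""
    if y <= 0:
        return 0.0
    if y <= delta:
        return alpha / (alpha + beta) * (y / delta) ** beta
    return 1.0 - beta / (alpha + beta) * (delta / y) ** alpha
```

A central-difference derivative of G recovers g on both branches, and G(δ) = α/(α + β), which is why the median equals δ exactly when α = β.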

Let Y \stackrel{d}{=} LL(δ, α, β), and let c > 0, r ≠ 0. Then cY \stackrel{d}{=} LL(cδ, α, β) and

Y^r \stackrel{d}{=} \begin{cases} LL(\delta^r, \alpha/r, \beta/r), & r > 0,\\ LL(\delta^r, \beta/|r|, \alpha/|r|), & r < 0.\end{cases}

In particular, if Y \stackrel{d}{=} LL(δ, α, β) then 1/Y \stackrel{d}{=} LL(1/δ, β, α), and if α = β we have the reciprocal property 1/Y \stackrel{d}{=} Y.

The characteristic function of an LL(1, α, β) random variable Y has the form

E(e^{itY}) = \frac{\alpha}{\alpha+\beta}\, M(\beta, \beta+1, it) + \frac{\alpha\beta}{\alpha+\beta}\, t^{\alpha}\,[C(t, -\alpha) + iS(t, -\alpha)], \quad t > 0,   (6.2.4)

where M(a, b, z) = 1 + \sum_{n=1}^{\infty} \frac{(a)_n}{(b)_n}\frac{z^n}{n!}, for a > 0, b > 0, z ∈ C, with (a)_n = a(a+1)\cdots(a+n-1), is the confluent hypergeometric function, and C(x, a) = \int_x^{\infty} t^{a-1}\cos t\,dt and S(x, a) = \int_x^{\infty} t^{a-1}\sin t\,dt are the generalized Fresnel integrals.

LL distributions are heavy tailed and some moments do not exist: the mean and the variance are finite only if α > 1 and α > 2, respectively. Due to the reciprocal properties of these laws, the harmonic mean is of the same form as the reciprocal of the mean. We also note that LL distributions are unimodal, with mode at δ when β > 1 and mode at zero when β < 1. We have

\text{Mean} = \delta\,\frac{\alpha\beta}{(\alpha-1)(\beta+1)}, \quad \alpha > 1,

r\text{th moment} = \delta^r\,\frac{\alpha\beta}{(\alpha-r)(\beta+r)}, \quad -\beta < r < \alpha,

\text{Variance} = \delta^2\left(\frac{\alpha\beta}{(\alpha-2)(\beta+2)} - \left[\frac{\alpha\beta}{(\alpha-1)(\beta+1)}\right]^2\right), \quad \alpha > 2.
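The moment formulas can be coded directly; a small sketch (ours, not from the chapter):

```python
def ll_moment(r, delta, alpha, beta):
    """r-th raw moment of LL(delta, alpha, beta); finite only for -beta < r < alpha."""
    if not (-beta < r < alpha):
        raise ValueError("moment of order r exists only for -beta < r < alpha")
    return delta**r * alpha * beta / ((alpha - r) * (beta + r))

def ll_mean(delta, alpha, beta):
    """Mean; requires alpha > 1."""
    return ll_moment(1, delta, alpha, beta)

def ll_variance(delta, alpha, beta):
    """Variance; requires alpha > 2."""
    return ll_moment(2, delta, alpha, beta) - ll_moment(1, delta, alpha, beta)**2
```

For example, with δ = 1, α = 3, β = 2 the mean is 6/(2·3) = 1 and the variance is 6/4 − 1 = 0.5.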

Log-Laplace distributions can be represented in terms of other well-known distributions, including the lognormal, exponential, uniform, Pareto and beta distributions. The log-Laplace distribution LL(δ, α, β) can be viewed as a lognormal distribution LN(µ, σ) where the parameters µ and σ are random. More specifically, the variable Y ∼ LL(δ, α, β) has the representation

Y \stackrel{d}{=} e^{\mu} R^{\sigma},

where R is a standard lognormal random variable,

\mu = \log\delta + \left(\frac{1}{\alpha} - \frac{1}{\beta}\right)E \quad \text{and} \quad \sigma = \sqrt{\frac{2E}{\alpha\beta}},

and E is a standard exponential variable independent of R. As a direct consequence of the fact that a skew Laplace variable arises as a difference of two independent exponential variables, we have

Y \stackrel{d}{=} \delta\, e^{E_1/\alpha - E_2/\beta},

where E_1 and E_2 are two i.i.d. standard exponential variables. Let U_1 and U_2 be independent random variables distributed uniformly on [0, 1]. Then we have

Y \stackrel{d}{=} \delta\, \frac{U_1^{1/\beta}}{U_2^{1/\alpha}}.

An LL random variable can also be represented as the ratio of two Pareto random variables:

Y \stackrel{d}{=} \delta\, \frac{P_1}{P_2},

where P_1 and P_2 are independent Pareto random variables with parameters α and β respectively; for more details, see Kozubowski and Podgórski (2003).
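These representations give simple simulation recipes. A sketch (ours) drawing LL variates two of the ways above; for α = β the median is δ, which both samplers should reproduce:

```python
import math
import random
import statistics

def rll_exp(delta, alpha, beta, rng):
    """Y = delta * exp(E1/alpha - E2/beta), E1, E2 i.i.d. standard exponential."""
    return delta * math.exp(rng.expovariate(1.0) / alpha - rng.expovariate(1.0) / beta)

def rll_unif(delta, alpha, beta, rng):
    """Y = delta * U1^(1/beta) / U2^(1/alpha), U1, U2 i.i.d. Uniform(0, 1)."""
    u1 = 1.0 - rng.random()   # in (0, 1], avoids 0
    u2 = 1.0 - rng.random()
    return delta * u1 ** (1.0 / beta) / u2 ** (1.0 / alpha)

rng = random.Random(1)
n = 100_000
s1 = [rll_exp(1.0, 2.0, 2.0, rng) for _ in range(n)]
s2 = [rll_unif(1.0, 2.0, 2.0, rng) for _ in range(n)]
```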

Entropy, a basic concept in information theory, is a measure of the uncertainty associated with the distribution of a random variable Y and is defined as H(Y) = E[−log f(Y)]. It has found applications in a variety of fields, including statistical mechanics, queueing theory, stock market analysis, image analysis and reliability. The entropy of Y ∼ LL(δ, α, β) is given by

H(Y) = 1 + \log\delta + \frac{1}{\alpha} - \frac{1}{\beta} + \log\left(\frac{1}{\alpha} + \frac{1}{\beta}\right).

For an AL random variable X, the entropy is given by

H(X) = 1 + \log\left(\frac{1}{\alpha} + \frac{1}{\beta}\right).

The entropy is maximized for an AL distribution, and hence the same property holds for the LL distribution also. Jose and Naik (2008) introduced a class of asymmetric pathway distributions and showed that the model maximizes various entropies.

The estimation of the parameters of the log-Laplace distribution is discussed by Hinkley and Revankar (1977), who gave the Fisher information matrix of the LL random variable. The maximum likelihood estimates of the parameters are given by Hartley and Revankar (1974), who showed that these estimators are asymptotically normal and efficient.

6.2.1 Multivariate Extension

Let X = (X_1, ..., X_d)' have a multivariate asymmetric Laplace distribution with characteristic function

\psi(t) = \left[1 + \tfrac{1}{2}\, t'\Sigma t - i\, m' t\right]^{-1},   (6.2.5)

where t' denotes the transpose of t, m ∈ R^d and Σ is a d × d non-negative definite symmetric matrix. A d-dimensional log-Laplace variable can be defined as a random vector of the

form Y = e^X = (e^{X_1}, ..., e^{X_d})'. If Σ is positive definite, then the distribution is d-dimensional and the corresponding density function can be derived easily from that of the Laplace distribution; see Kotz et al. (2001). We have

g(y) = f(\log y)\left(\prod_{i=1}^{d} y_i\right)^{-1}, \quad y = (y_1, \ldots, y_d)' > 0,

where log y = (log y_1, ..., log y_d)' is defined componentwise and

f(x) = \frac{2\, e^{x'\Sigma^{-1}m}}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \left(\frac{x'\Sigma^{-1}x}{2 + m'\Sigma^{-1}m}\right)^{\nu/2} K_{\nu}\!\left(\sqrt{(2 + m'\Sigma^{-1}m)(x'\Sigma^{-1}x)}\right), \quad x \ne 0,

is the density of the multivariate asymmetric Laplace distribution. Here ν = (2 − d)/2 and K_ν is the modified Bessel function of the third kind. Similar to the univariate case, multivariate LL distributions also possess the stability and limiting properties with respect to geometric multiplication. Each component of a multivariate LL random vector is univariate LL.

6.2.2 Divisibility properties

Klebanov et al. (1984) introduced geometric infinite divisibility (g.i.d.) and obtained several characterizations in terms of characteristic functions. The class of g.i.d. distributions forms a subclass of the infinitely divisible (i.d.) distributions and contains the class of distributions with complete monotone derivative (c.m.d.). They also introduced and characterized the related concept of geometric strict stability (g.s.s.) for real valued random variables. The exponential and geometric distributions are examples of distributions that possess the g.i.d. and g.s.s. properties. Mittag-Leffler distributions, Laplace distributions, etc. are g.i.d.; see Pillai (1990) and Pillai and Sandhya (1990). Fujitha (1993) constructed a larger class of g.i.d. distributions with support on the non-negative half-line. Bondesson (1979) and Shanbhag and Sreehari (1977) established the self-decomposability of many of the most

commonly occurring distributions in practice. Kozubowski and Podgórski (2010) introduced a notion of random self-decomposability and discussed its relation to the concepts of self-decomposability and g.i.d. The notions of infinite divisibility (i.d.) and geometric infinite divisibility (g.i.d.) play a fundamental role in the study of the central limit theorem and of Lévy processes. Variables appearing in many applications in various sciences can often be represented as sums of a large number of tiny variables, often i.i.d. The theory of infinitely divisible distributions developed primarily during the period from 1920 onwards.

Remark 6.2.1. AL distributions are i.d.

Remark 6.2.2. LL distributions are not i.d.

Definition 6.2.2. A random variable X and its probability distribution are said to be g.i.d. if for any p ∈ (0, 1) they satisfy the relation

X \stackrel{d}{=} \sum_{i=1}^{\nu_p} X_p^{(i)},

where ν_p is a geometric random variable with mean 1/p, the random variables X_p^{(i)} are i.i.d. for each p, and ν_p and (X_p^{(i)}) are independently distributed; see Klebanov et al. (1984).

Characterization of g.i.d. A random variable X is g.i.d. if and only if (iff)

\varphi_X(t) = \frac{1}{1 + \psi(t)},

where ψ(t) is a non-negative function with complete monotone derivative (c.m.d.) and ψ(0) = 0.

Now we consider divisibility properties with respect to multiplication. Kozubowski and Podgórski (2003) discussed multiplicative divisibility and geometric multiplicative divisibility. There have been no further developments in this area in the literature.
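Divisibility under multiplication can be illustrated by simulation (a sketch of ours, anticipating the definitions below). Because the AL log is infinitely divisible, with each n-th convolution factor expressible through Gamma(1/n, 1) components, an LL variable factors exactly into n i.i.d. multiplicative pieces:

```python
import math
import random
import statistics

def ll_factor(n, delta, alpha, beta, rng):
    """One of n i.i.d. factors whose product is LL(delta, alpha, beta):
    Y_i = delta^(1/n) * exp(G1/alpha - G2/beta), with G1, G2 ~ Gamma(1/n, 1)."""
    g1 = rng.gammavariate(1.0 / n, 1.0)
    g2 = rng.gammavariate(1.0 / n, 1.0)
    return delta ** (1.0 / n) * math.exp(g1 / alpha - g2 / beta)

def ll_product(n, delta, alpha, beta, rng):
    """Product of n i.i.d. factors; distributed exactly as LL(delta, alpha, beta)."""
    p = 1.0
    for _ in range(n):
        p *= ll_factor(n, delta, alpha, beta, rng)
    return p

rng = random.Random(7)
sample = [ll_product(5, 1.0, 2.0, 2.0, rng) for _ in range(100_000)]
```

Since the five Gamma(1/5, 1) summands add to a standard exponential, the product has the LL(1, 2, 2) law, whose median is δ = 1.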

Definition 6.2.3. A random variable Y is said to be multiplicative infinitely divisible (m.i.d.) if for each n = 1, 2, 3, ... it has the representation

Y \stackrel{d}{=} \prod_{i=1}^{n} Y_i,

for some i.i.d. random variables Y_i.

Theorem 6.2.1. LL distributions are m.i.d.

Proof. Let Y ∼ LL. We have to prove that Y \stackrel{d}{=} \prod_{i=1}^{n} Y_i, where the Y_i are i.i.d. random variables. Taking logarithms on both sides, we get \log Y = \sum_{i=1}^{n} \log Y_i. Since X = log Y follows an AL distribution, we need only prove that X is i.d., and AL distributions are i.d. Therefore it follows that LL distributions are m.i.d.

Definition 6.2.4. A random variable Y is said to be geometric multiplicative infinitely divisible (g.m.i.d.) if for any p ∈ (0, 1) it satisfies the relation

Y \stackrel{d}{=} \prod_{i=1}^{\nu_p} Y_p^{(i)},

where ν_p is a geometric random variable with mean 1/p, the random variables Y_p^{(i)} are i.i.d. for each p, and ν_p and (Y_p^{(i)}) are independently distributed.

Characterization of g.m.i.d. A random variable X is g.m.i.d. if and only if

\varphi_{\log X}(t) = \frac{1}{1 + \psi(t)},

where ψ(t) is a non-negative function with complete monotone derivative (c.m.d.) and ψ(0) = 0.

Theorem 6.2.2. LL distributions are g.m.i.d.

The log-Laplace laws arise as limits of products Y_1 Y_2 \cdots Y_{\nu_p} of i.i.d. random variables with a geometric number of terms, just as Laplace distributions are limits of sums X_1 + X_2 + \cdots + X_{\nu_p} with a geometric number of terms.

6.2.3 Product Autoregression

McKenzie (1982) introduced a product autoregression structure. A product autoregression structure of order one (PAR(1)) has the form

Y_n = Y_{n-1}^{a}\,\varepsilon_n, \quad 0 < a \le 1, \quad n = 0, \pm 1, \pm 2, \ldots,   (6.2.6)

where {Y_n} and {ε_n} are sequences of positive random variables and they are independently distributed. In the usual non-linear autoregressive models we have an additive noise, but in product autoregressive models the noise is non-additive. We may determine the correlation structure as follows. Assuming stationarity, iterating (6.2.6) gives

Y_n = \left\{\prod_{i=0}^{k-1} \varepsilon_{n-i}^{\,a^i}\right\} Y_{n-k}^{\,a^k},

so that

E(Y_n Y_{n-k}) = \left\{\prod_{i=0}^{k-1} E\!\left(\varepsilon^{a^i}\right)\right\} E\!\left(Y_{n-k}^{\,a^k+1}\right).

From (6.2.6), E(Y^s) = E(Y^{as})\,E(\varepsilon^s), and therefore

E(Y_n Y_{n-k}) = \left\{\prod_{i=0}^{k-1} \frac{E(Y^{a^i})}{E(Y^{a^{i+1}})}\right\} E(Y^{a^k+1}) = \frac{E(Y)\,E(Y^{a^k+1})}{E(Y^{a^k})}.   (6.2.7)

Now consider the autocorrelation function ρ_Y(k) = Corr(Y_n, Y_{n-k}) when Y has a log-Laplace distribution. Writing s = a^k, we obtain

\rho_Y(k) = \frac{(\alpha-2)(\beta+2)\left[(\alpha-s)(\beta+s)(\alpha-1)(\beta+1) - \alpha\beta(\alpha-s-1)(\beta+s+1)\right]}{(\alpha-s-1)(\beta+s+1)\left[(\alpha-1)^2(\beta+1)^2 - \alpha\beta(\alpha-2)(\beta+2)\right]}, \quad \alpha > 2.   (6.2.8)

The usual additive first order autoregressive model is given by

X_n = aX_{n-1} + \epsilon_n, \quad 0 < a < 1, \quad n = 0, \pm 1, \pm 2, \ldots,   (6.2.9)

where {ε_n} is the innovation sequence of independent and identically distributed random variables. Its autocorrelation function is given by ρ_X(k) = a^{|k|}, k = 0, ±1, ±2, .... From (6.2.8) it is clear that the correlation structure is not preserved in the case of the log-Laplace distribution. It is well known that the correlation structure is not preserved in going from the lognormal to the normal distribution. McKenzie (1982) showed that the gamma distribution is the only one for which the PAR(1) process has the Markov correlation structure.

6.2.4 Self-decomposability

The basic problem in time series analysis is to find the distribution of {ε_n}. The class of self-decomposable (s.d.) distributions forms a subset of the class of infinitely divisible distributions and includes the stable distributions as a proper subset. A number of authors have examined the L-class in detail and many of its members are now well known.
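The log-Laplace PAR(1) autocorrelation can be evaluated directly; a minimal sketch (ours, not from the chapter), which also confirms that the formula gives ρ_Y(0) = 1:

```python
def ll_par1_acf(k, a, alpha, beta):
    """Autocorrelation of the PAR(1) process with LL marginal; requires alpha > 2.
    Follows the reconstructed form of (6.2.8) with s = a**k."""
    s = a ** k
    num = (alpha - 2) * (beta + 2) * (
        (alpha - s) * (beta + s) * (alpha - 1) * (beta + 1)
        - alpha * beta * (alpha - s - 1) * (beta + s + 1))
    den = (alpha - s - 1) * (beta + s + 1) * (
        (alpha - 1) ** 2 * (beta + 1) ** 2
        - alpha * beta * (alpha - 2) * (beta + 2))
    return num / den
```

For a = 0.7, α = 3, β = 2 the sequence starts at 1 and decays with the lag, but not geometrically as a^k.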

Definition 6.2.5 (Kozubowski and Podgórski (2010)). A distribution with characteristic function ϕ is randomly self-decomposable (r.s.d.) if for each p, c ∈ [0, 1] there exists a probability distribution with characteristic function ϕ_{c,p} satisfying

\varphi(t) = \varphi_{c,p}(t)\,[p + (1-p)\varphi(ct)].

Definition 6.2.6. A characteristic function ϕ_{\log X} is multiplicative self-decomposable (m.s.d.) if for every 0 < a < 1 there exists a characteristic function ϕ_a such that

\varphi_{\log X}(t) = \varphi_{\log X}(at)\,\varphi_a(t), \quad t \in \mathbb{R}.

There are several distributions in the L-class which are the distributions of the natural logarithms of random variables whose distributions are also self-decomposable; these include the normal, the log-gamma and the log-F distributions. This phenomenon is very interesting from a time series point of view, because the logarithmic transformation is the commonest of all transformations used in time series analysis.

6.2.5 Autoregressive model

If we take logarithms of Y_n in (6.2.6) and let X_n = log Y_n, then the stationary process {X_n} has the form

X_n = aX_{n-1} + \eta_n, \quad \text{where } \eta_n = \log\varepsilon_n,   (6.2.10)

which has the form of a linear additive autoregressive model of order one. Then we can proceed as in the case of AR(1) processes. From (6.2.10), under the assumption of stationarity, we can obtain the characteristic function of η as

\phi_\eta(t) = \frac{\phi_X(t)}{\phi_X(at)}.   (6.2.11)

We know that X follows an AL distribution with characteristic function

\phi(t) = \frac{e^{i\theta t}}{1 + \tfrac{1}{2}\sigma^2 t^2 - i\mu t}, \quad -\infty < t < \infty, \; \sigma > 0, \; -\infty < \mu < \infty.
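This characteristic function factors into two exponential-component terms (the factorization used next); the identity can be checked numerically with complex arithmetic. A small sketch (ours; function names are not from the chapter):

```python
import cmath
import math

def al_cf(t, theta, kappa, sigma):
    """AL characteristic function with mu = (sigma/sqrt(2)) (1/kappa - kappa)."""
    mu = sigma / math.sqrt(2) * (1 / kappa - kappa)
    return cmath.exp(1j * theta * t) / (1 + sigma**2 * t**2 / 2 - 1j * mu * t)

def al_cf_factored(t, theta, kappa, sigma):
    """Same function as a product of two linear factors in t."""
    s2 = sigma / math.sqrt(2)
    return cmath.exp(1j * theta * t) / ((1 + 1j * s2 * kappa * t) * (1 - 1j * s2 * t / kappa))
```

The two forms agree to machine precision for any real t, which is exactly what makes the innovation distribution of the AR(1) model tractable.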

This characteristic function can be factored as

\phi(t) = e^{i\theta t}\left(1 + \frac{i\sigma\kappa t}{\sqrt{2}}\right)^{-1}\left(1 - \frac{i\sigma t}{\sqrt{2}\,\kappa}\right)^{-1},   (6.2.12)

where κ > 0 and µ = \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa} - \kappa\right); see Kotz et al. (2001). Substituting (6.2.12) in (6.2.11), we get

\phi_\eta(t) = \frac{e^{i\theta t}}{e^{i\theta a t}}\cdot\frac{\left(1 + \frac{i\sigma\kappa a t}{\sqrt{2}}\right)\left(1 - \frac{i\sigma a t}{\sqrt{2}\kappa}\right)}{\left(1 + \frac{i\sigma\kappa t}{\sqrt{2}}\right)\left(1 - \frac{i\sigma t}{\sqrt{2}\kappa}\right)} = e^{i\theta(1-a)t}\left[a + \frac{1-a}{1 + \frac{i\sigma\kappa t}{\sqrt{2}}}\right]\left[a + \frac{1-a}{1 - \frac{i\sigma t}{\sqrt{2}\kappa}}\right].   (6.2.13)

This implies that η has a convolution structure of the following form:

\eta \stackrel{d}{=} U + T_1 - T_2,   (6.2.14)

where U is a degenerate random variable taking the value θ(1 − a) with probability one, and

T_1 = \begin{cases}0, & \text{with probability } a,\\ E_1, & \text{with probability } 1-a,\end{cases} \qquad T_2 = \begin{cases}0, & \text{with probability } a,\\ E_2, & \text{with probability } 1-a,\end{cases}

where E_1 and E_2 are exponential random variables with means σ/(√2 κ) and σκ/√2 respectively.

6.2.6 Sample path properties

Sample path properties of the process are studied by generating 100 observations each from the process with various parameter (θ, κ, σ) combinations.
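The innovation structure (6.2.14) makes simulation of the AL-marginal AR(1) process straightforward. An illustrative Python sketch (ours; the parameter mapping follows the reconstruction above, so the exponential means are an assumption of this sketch):

```python
import math
import random
import statistics

def al_ar1_path(n, a, theta, kappa, sigma, rng):
    """Generate X_t = a X_{t-1} + eta_t with eta = theta(1-a) + T1 - T2,
    where T1, T2 are 0 with probability a and exponential otherwise."""
    m1 = sigma / (math.sqrt(2) * kappa)   # mean of E1
    m2 = sigma * kappa / math.sqrt(2)     # mean of E2
    # start in the stationary AL marginal: X0 = theta + E1 - E2
    x = theta + rng.expovariate(1 / m1) - rng.expovariate(1 / m2)
    path = [x]
    for _ in range(n - 1):
        t1 = 0.0 if rng.random() < a else rng.expovariate(1 / m1)
        t2 = 0.0 if rng.random() < a else rng.expovariate(1 / m2)
        x = a * x + theta * (1 - a) + t1 - t2
        path.append(x)
    return path

rng = random.Random(5)
path = al_ar1_path(200_000, 0.5, 0.0, 1.0, math.sqrt(2), rng)
```

With θ = 0 and κ = 1 the stationary marginal is a symmetric Laplace with mean 0 and variance σ²/2·(1/κ² + κ²) = 2, which the long simulated path should match.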

Figure 6.2: (a) Sample path of the process for a = 0.70 and (θ, κ, σ) = (0, 1, 1). (b) Sample path of the process for a = 0.70 and (θ, κ, σ) = (0, 10, 10).

In Figure 6.2 we take a = 0.7 and the values of (θ, κ, σ) as (0, 1, 1) and (0, 10, 10) respectively. In Figure 6.3, we take a = 0.4 and the values of (θ, κ, σ) as (0, 1, 1) and (0, 2, 2) respectively. The process exhibits both positive and negative values, with upward as well as downward runs, as seen from the figures.

6.2.7 Estimation of parameters

The moments and cumulants of the sequence of innovations {η_n} can be obtained directly from (6.2.13) as

E(\eta_n) = (1-a)\left(\theta + \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa} - \kappa\right)\right), \qquad \operatorname{Var}(\eta_n) = (1-a^2)\,\frac{\sigma^2}{2}\left(\frac{1}{\kappa^2} + \kappa^2\right),

and

k_n = \begin{cases}(n-1)!\,(1-a^n)\left(\frac{\sigma}{\sqrt{2}}\right)^n\left(\frac{1}{\kappa^n} - \kappa^n\right), & \text{if } n > 1 \text{ is odd},\\ (n-1)!\,(1-a^n)\left(\frac{\sigma}{\sqrt{2}}\right)^n\left(\frac{1}{\kappa^n} + \kappa^n\right), & \text{if } n \text{ is even}.\end{cases}

Since the mean and variance of the AL distribution are

E(X) = \theta + \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa} - \kappa\right) \quad \text{and} \quad \operatorname{Var}(X) = \frac{\sigma^2}{2}\left(\frac{1}{\kappa^2} + \kappa^2\right)

Figure 6.3: (a) Sample path of the process for a = 0.70 and (θ, κ, σ) = (0, 1, 1). (b) Sample path of the process for a = 0.40 and (θ, κ, σ) = (0, 2, 2).

and the higher order cumulants are given by

k_n = \begin{cases}(n-1)!\left(\frac{\sigma}{\sqrt{2}}\right)^n\left(\frac{1}{\kappa^n} - \kappa^n\right), & \text{if } n > 1 \text{ is odd},\\ (n-1)!\left(\frac{\sigma}{\sqrt{2}}\right)^n\left(\frac{1}{\kappa^n} + \kappa^n\right), & \text{if } n \text{ is even},\end{cases}

the higher order moments can be obtained easily from the cumulants, since k_3 = \mu_3, k_4 = \mu_4 - 3\mu_2^2 and k_5 = \mu_5 - 10\mu_2\mu_3. Hence the problem of estimating the parameters of the process can be tackled in a way similar to the method of moments.

6.2.8 Multivariate product autoregression

A multivariate product autoregression structure of order one (PAR(1)) has the form

\mathbf{Y}_n = \mathbf{Y}_{n-1}^{a}\,\boldsymbol{\varepsilon}_n, \quad 0 < a \le 1, \quad n = 0, \pm 1, \pm 2, \ldots,   (6.2.15)

where {Y_n} and {ε_n} are sequences of positive d-variate random vectors and they are independently distributed (the power is taken componentwise). Here also we have a non-additive noise. For further analysis, we can take logarithms of Y_n in (6.2.15); let X_n = log Y_n. Then we

obtain a multivariate linear AR(1) model,

\mathbf{X}_n = a\mathbf{X}_{n-1} + \boldsymbol{\eta}_n, \quad 0 < a < 1,   (6.2.16)

where X_n and the innovations η_n = log ε_n are d-variate random vectors. Clearly X_n follows a multivariate asymmetric Laplace distribution with the characteristic function given in (6.2.5). Then the characteristic function of η_n can be obtained from

\psi_{\eta}(t) = \frac{\psi_X(t)}{\psi_X(at)},

where ψ_X(t) = E(exp(i t'X)). By inverting the characteristic function, we can obtain the density function of η.

6.3 The double Pareto lognormal distribution and its properties

We know that logarithmic scaling of the normal distribution results in the lognormal distribution. Logarithmic scaling of the Laplace distribution leads to the log-Laplace distribution, which is also known as the double Pareto distribution. The double Pareto lognormal (DPLN) distribution is an exponentiated version of a normal Laplace random variable. A double Pareto lognormal random variable can be defined using a geometric Brownian motion (GBM)

dX = \mu X\,dt + \sigma X\,dW,

where X(t) denotes an individual's income at time t, µ and σ² are the mean drift and variance parameters, and dW is white noise. Suppose that the distribution of starting incomes X_0 is lognormal, so that ln X_0 ∼ N(ν, τ²). After T time units the state X(T) will also be lognormally distributed, with ln X(T) ∼ N(ν + (µ − σ²/2)T, τ² + σ²T). Now let the observation time T itself be exponentially distributed, and let \hat{X} denote the state of the process at this random time. To find the distribution of \hat{X}, it is easiest to work on the logarithmic scale. Let Y = ln \hat{X} (so that Y is the state of an ordinary Brownian motion after an exponentially distributed time). The distribution of Y can easily be shown (see Reed (2003)) to be that of the sum of independent random variables W

and Z, where Z follows a normal distribution with parameters ν and τ², and W follows an asymmetric Laplace distribution with parameters λ_1 and λ_2. The p.d.f. of Y is given by

g_Y(y) = \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\,\phi\!\left(\frac{y-\nu}{\tau}\right)\left[R\!\left(\lambda_1\tau - \frac{y-\nu}{\tau}\right) + R\!\left(\lambda_2\tau + \frac{y-\nu}{\tau}\right)\right],   (6.3.1)

where R(z) = \frac{1-\Phi(z)}{\phi(z)} is the Mills ratio, and Φ(z) and φ(z) are the cumulative distribution function (c.d.f.) and the p.d.f. of the standard normal distribution. The distribution of Y is known as the normal Laplace distribution NL(λ_1, λ_2, ν, τ²). A normal Laplace random variable has a characteristic function which is the product of the characteristic functions of its normal and Laplace components (see Reed and Jorgensen (2004)):

\phi_Y(t) = \exp\!\left(i\nu t - \frac{\tau^2 t^2}{2}\right)\frac{\lambda_1\lambda_2}{(\lambda_1 - it)(\lambda_2 + it)}.   (6.3.2)

Reed and Jorgensen (2004) established that the normal Laplace distribution is infinitely divisible. Expanding the cumulant generating function of the normal Laplace distribution, it is easy to show that

E(Y) = \nu + \frac{1}{\lambda_1} - \frac{1}{\lambda_2} \quad \text{and} \quad \operatorname{Var}(Y) = \tau^2 + \frac{1}{\lambda_1^2} + \frac{1}{\lambda_2^2}.   (6.3.3)

The third and fourth cumulants are given by

\kappa_3 = \frac{2}{\lambda_1^3} - \frac{2}{\lambda_2^3} \quad \text{and} \quad \kappa_4 = \frac{6}{\lambda_1^4} + \frac{6}{\lambda_2^4}.   (6.3.4)

Now the p.d.f. of X can easily be obtained from (6.3.1); it can be expressed in terms of the Mills ratio as f(x) = \frac{1}{x}\,g(\ln x).
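As a numerical sanity check on the normal Laplace density (an illustrative sketch; helper names are ours), the Mills-ratio form can be coded with the standard library and integrated by the trapezoid rule:

```python
import math

SQRT2PI = math.sqrt(2 * math.pi)

def phi(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / SQRT2PI

def mills(z):
    """Mills ratio R(z) = (1 - Phi(z)) / phi(z), via erfc for stability."""
    return 0.5 * math.erfc(z / math.sqrt(2)) / phi(z)

def nl_pdf(y, lam1, lam2, nu, tau):
    """Normal Laplace density in the reconstructed form of (6.3.1)."""
    z = (y - nu) / tau
    return (lam1 * lam2 / (lam1 + lam2)) * phi(z) * (
        mills(lam1 * tau - z) + mills(lam2 * tau + z))

# Trapezoid check that the density has (approximately) unit total mass.
lo, hi, n = -30.0, 30.0, 60_000
h = (hi - lo) / n
area = sum(nl_pdf(lo + i * h, 2.0, 1.0, 0.0, 1.0) * (0.5 if i in (0, n) else 1.0)
           for i in range(n + 1)) * h
```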

A random variable X is said to have a DPLN distribution if its p.d.f. is

f_X(x) = \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\left[x^{-\lambda_1-1}\exp\!\left(\lambda_1\nu + \frac{\lambda_1^2\tau^2}{2}\right)\Phi\!\left(\frac{\ln x - \nu - \lambda_1\tau^2}{\tau}\right) + x^{\lambda_2-1}\exp\!\left(-\lambda_2\nu + \frac{\lambda_2^2\tau^2}{2}\right)\Phi^c\!\left(\frac{\ln x - \nu + \lambda_2\tau^2}{\tau}\right)\right],

where Φ is the c.d.f. and Φ^c the complementary c.d.f. of N(0, 1). We write X ∼ DPLN(λ_1, λ_2, ν, τ²) to denote a random variable following the double Pareto lognormal distribution.

The DPLN distribution and the log-Laplace distribution share some properties. Like the log-Laplace distributions, the DPLN distribution can be represented as a continuous mixture of lognormal distributions with different variances. A DPLN(λ_1, λ_2, ν, τ²) random variable can be expressed as

X \stackrel{d}{=} U\,\frac{V_1}{V_2},

where U, V_1 and V_2 are independent, U is lognormally distributed, and V_1 and V_2 are Pareto random variables with parameters λ_1 and λ_2 respectively. We can also express X as X \stackrel{d}{=} UQ, where Q, the ratio of the Pareto random variables, is known as a double Pareto random variable and has the p.d.f.

f(q) = \begin{cases}\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\,q^{\lambda_2-1}, & 0 < q \le 1,\\ \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\,q^{-\lambda_1-1}, & q > 1.\end{cases}

The DPLN(λ_1, λ_2, ν, τ²) distribution can be represented as a mixture of the form

f(x) = \frac{\lambda_2}{\lambda_1+\lambda_2}\,f_1(x) + \frac{\lambda_1}{\lambda_1+\lambda_2}\,f_2(x),

where

f_1(x) = \lambda_1\,x^{-\lambda_1-1}\exp\!\left(\lambda_1\nu + \frac{\lambda_1^2\tau^2}{2}\right)\Phi\!\left(\frac{\ln x - \nu - \lambda_1\tau^2}{\tau}\right)

and

f_2(x) = \lambda_2\,x^{\lambda_2-1}\exp\!\left(-\lambda_2\nu + \frac{\lambda_2^2\tau^2}{2}\right)\Phi^c\!\left(\frac{\ln x - \nu + \lambda_2\tau^2}{\tau}\right),

which are the limiting forms of the DPLN(λ_1, λ_2, ν, τ²) when λ_2 → ∞ and λ_1 → ∞ respectively. For details, see Reed and Jorgensen (2004). The cumulative distribution function of DPLN(λ_1, λ_2, ν, τ²) can be written as

F(x) = \Phi\!\left(\frac{\ln x - \nu}{\tau}\right) - \frac{1}{\lambda_1+\lambda_2}\left[\lambda_2\,x^{-\lambda_1}\exp\!\left(\lambda_1\nu + \frac{\lambda_1^2\tau^2}{2}\right)\Phi\!\left(\frac{\ln x - \nu - \lambda_1\tau^2}{\tau}\right) - \lambda_1\,x^{\lambda_2}\exp\!\left(-\lambda_2\nu + \frac{\lambda_2^2\tau^2}{2}\right)\Phi^c\!\left(\frac{\ln x - \nu + \lambda_2\tau^2}{\tau}\right)\right].

Like the log-Laplace distribution, this distribution also exhibits power-law behaviour in both tails, for values near zero and for large positive values; i.e.

f(x) \sim k_1\,x^{-\lambda_1-1} \text{ as } x \to \infty, \qquad f(x) \sim k_2\,x^{\lambda_2-1} \text{ as } x \to 0,

where k_1 = \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\exp(\lambda_1\nu + \lambda_1^2\tau^2/2) and k_2 = \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\exp(-\lambda_2\nu + \lambda_2^2\tau^2/2).

The parameters of the DPLN distribution include a lognormal mean (ν) and variance (τ²) parameter, which control the location and spread of the body of the distribution, and power-law scaling exponents for the left (λ_2) and right (λ_1) tails. Special cases of the DPLN include the right Pareto lognormal (RPLN) distribution, with a power-law tail on the right but not the left (λ_2 → ∞); the left Pareto lognormal (LPLN) distribution, with a power-law tail only near zero (λ_1 → ∞); and the lognormal distribution, with no power-law tails (λ_1, λ_2 → ∞). The DPLN(λ_1, λ_2, ν, τ²) p.d.f. is unimodal if λ_2 > 1 and is monotonically decreasing

when 0 < λ_2 < 1. The moment generating function does not exist for a DPLN distribution. The lower order moments about zero are given by

\mu_r' = E(X^r) = \frac{\lambda_1\lambda_2}{(\lambda_1-r)(\lambda_2+r)}\exp\!\left(r\nu + \frac{r^2\tau^2}{2}\right) \quad \text{for } r < \lambda_1;

\mu_r' does not exist for r \ge \lambda_1. The mean (for λ_1 > 1) is

E(X) = \frac{\lambda_1\lambda_2}{(\lambda_1-1)(\lambda_2+1)}\,e^{\nu+\tau^2/2}

and the variance (for λ_1 > 2) is

\operatorname{Var}(X) = \frac{\lambda_1\lambda_2\,e^{2\nu+\tau^2}}{(\lambda_1-1)^2(\lambda_2+1)^2}\left[\frac{e^{\tau^2}(\lambda_1-1)^2(\lambda_2+1)^2}{(\lambda_1-2)(\lambda_2+2)} - \lambda_1\lambda_2\right].

The DPLN family of distributions is closed under power-law transformation: if X ∼ DPLN(λ_1, λ_2, ν, τ²), then for constants a > 0, b > 0, W = aX^b will also follow a DPLN distribution. In other words, W ∼ DPLN(λ_1/b, λ_2/b, bν + ln a, b²τ²).

Various applications of the DPLN distribution are discussed by Reed and Jorgensen (2004). Reed (2003) showed the distribution of income sizes at a point in time to be the product of independent double Pareto and lognormal components. This distribution fits the distribution of sizes of cities within a particular country very well, since it has power-law behaviour in both tails (see Reed (2002)). It also fits the grouped data on aeolian sand particle size and diamond particle size given in Barndorff-Nielsen (1977). The similarity between biological and language evolution has attracted the interest of researchers accustomed to analyzing genetic properties in biological populations, with the aim of describing problems of linguistics. Languages continuously evolve, changing, for example, their lexicon and their phonetic and grammatical structure. This evolution is similar to the evolution of species driven by mutations and natural selection. Schwämmle et al. (2009) show that the distribution of language sizes in terms of speaker population can be well modelled by the DPLN distribution.
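The moment formula, the U·P_1/P_2 representation and the power-law closure lend themselves to a quick numerical check (an illustrative sketch of ours; names are not from the chapter). The closure property is verified exactly through the moments, and the sampler is compared against the mean formula:

```python
import math
import random
import statistics

def dpln_moment(r, lam1, lam2, nu, tau):
    """r-th raw moment of DPLN(lam1, lam2, nu, tau^2); exists only for r < lam1."""
    if r >= lam1:
        raise ValueError("moment exists only for r < lam1")
    return lam1 * lam2 / ((lam1 - r) * (lam2 + r)) * math.exp(r * nu + r * r * tau * tau / 2)

def rdpln(lam1, lam2, nu, tau, rng):
    """Draw X = U * P1 / P2: U lognormal, P1 ~ Pareto(lam1), P2 ~ Pareto(lam2)."""
    u = math.exp(rng.gauss(nu, tau))
    p1 = (1.0 - rng.random()) ** (-1.0 / lam1)   # inverse-CDF Pareto on [1, inf)
    p2 = (1.0 - rng.random()) ** (-1.0 / lam2)
    return u * p1 / p2

# Closure under W = a X^b: E(W) = a E(X^b) must match the first moment of
# DPLN(lam1/b, lam2/b, b nu + ln a, b^2 tau^2).
a, b = 2.0, 1.5
lhs = a * dpln_moment(b, 6.0, 3.0, 0.2, 0.4)
rhs = dpln_moment(1.0, 6.0 / b, 3.0 / b, b * 0.2 + math.log(a), b * 0.4)

rng = random.Random(11)
sample = [rdpln(20.0, 5.0, 0.0, 0.25, rng) for _ in range(300_000)]
```

With λ_1 = 20 the right tail is light enough for the sample mean to settle quickly near the formula value.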

For the Internet, many studies have shown that traffic patterns appear to have self-similarity. This self-similarity can possibly be explained if the underlying distribution of file sizes obeys an appropriate power law. Ramírez et al. (2008) assumed that claim sizes in insurance contexts can be modelled as DPLN random variables. Mitzenmacher (2004) introduced and analyzed a new dynamic generative user model to explain the behaviour of web file size distributions: file sizes tend to have a lognormal body but a Pareto tail, and it was also shown that the DPLN gives a good fit for file size distributions. Seshadri et al. (2008) analyzed a massive social network gathered from the records of a large mobile phone operator with more than a million users and tens of millions of calls. They examined the distributions of the number of phone calls per customer, the total talk minutes per customer and the distinct number of calling partners per customer, and found better fits using the DPLN distribution.

6.3.1 Product Autoregression

Let us consider the PAR(1) structure with DPLN marginal distribution. The autocorrelation function ρ_X(k) = Corr(X_n, X_{n-k}) has the form (with s = a^k)

\rho_X(k) = \frac{e^{s\tau^2}\,\dfrac{(\lambda_1-s)(\lambda_2+s)(\lambda_1-1)(\lambda_2+1)}{(\lambda_1-s-1)(\lambda_2+s+1)} - \lambda_1\lambda_2}{e^{\tau^2}\,\dfrac{(\lambda_1-1)^2(\lambda_2+1)^2}{(\lambda_1-2)(\lambda_2+2)} - \lambda_1\lambda_2}.   (6.3.5)

From (6.3.5), it is clear that the correlation structure is not preserved in the case of the DPLN distribution either.

6.3.2 Autoregressive model

If we take logarithms of Y_n in (6.2.6) and let X_n = ln Y_n, then the stationary process {X_n} has the form

X_n = aX_{n-1} + \eta_n, \quad \text{where } \eta_n = \ln\varepsilon_n,   (6.3.6)

which has the form of a linear additive autoregressive model of order one. Then we can proceed as in the case of AR(1) processes. Now from (6.3.6), we can obtain the characteristic

function of η as

\phi_\eta(t) = \frac{\phi_Y(t)}{\phi_Y(at)}.

We know Y has a normal Laplace distribution NL(λ_1, λ_2, ν, τ²). Then we can write the above expression as

\phi_\eta(t) = \exp\!\left(i\nu(1-a)t - \frac{(1-a^2)\tau^2 t^2}{2}\right)\left[\frac{\lambda_1 - iat}{\lambda_1 - it}\right]\left[\frac{\lambda_2 + iat}{\lambda_2 + it}\right].   (6.3.7)

Also,

\frac{\lambda_1 - iat}{\lambda_1 - it} = a + (1-a)\frac{\lambda_1}{\lambda_1 - it} \quad \text{and} \quad \frac{\lambda_2 + iat}{\lambda_2 + it} = a + (1-a)\frac{\lambda_2}{\lambda_2 + it}.

Gaver and Lewis (1980) obtained the innovation distribution of the EAR(1) process as the exponentially tailed distribution with parameters a and λ_1 (ET(a, λ_1)). Hence the innovation sequence {η_n} can be treated as a sequence of random variables of the form

\eta \stackrel{d}{=} Z + E_1 - E_2,   (6.3.8)

where Z ∼ N(ν(1−a), (1−a²)τ²), E_1 ∼ ET(a, λ_1) and E_2 ∼ ET(a, λ_2).

6.3.3 Estimation of parameters

Reed and Jorgensen (2004) discussed the method of moments and maximum likelihood estimation procedures for the DPLN distribution. Assuming that the data come from the DPLN distribution, one can obtain method of moments estimates (MMEs) of λ_1, λ_2, ν and τ² using the first four moments of either the DPLN distribution or, having first log-transformed the data, of the NL distribution. The estimates are not the same. Use of the DPLN moments, however, is not recommended, since population moments of order λ_1 or more do not exist. To find MMEs of λ_1 and λ_2 using the NL distribution, one needs only to solve (6.3.4),

with κ_3 and κ_4 set to their sample equivalents. Estimates of ν and τ² can then be obtained from (6.3.3). It can happen that (6.3.4) has no real solution, and Reed and Jorgensen (2004) show that estimates can then be obtained by maximum likelihood; they recommend the use of the method of moments only for finding starting values for iterative procedures for finding maximum likelihood estimates. In their absence, a trial and error method can be used. Unlike MMEs, maximum likelihood estimates (MLEs) are the same whether one fits the DPLN to data x_1, x_2, ..., x_n or fits the NL to y_1 = ln x_1, ..., y_n = ln x_n. The log-likelihood function is

L = n\ln\lambda_1 + n\ln\lambda_2 - n\ln(\lambda_1+\lambda_2) + \sum_{i=1}^{n}\ln\phi\!\left(\frac{y_i-\nu}{\tau}\right) + \sum_{i=1}^{n}\ln\left[R\!\left(\lambda_1\tau - \frac{y_i-\nu}{\tau}\right) + R\!\left(\lambda_2\tau + \frac{y_i-\nu}{\tau}\right)\right].

This can be maximized analytically over ν to yield

\hat{\nu} = \bar{y} - \frac{1}{\lambda_1} + \frac{1}{\lambda_2}.

Then

\hat{L}(\lambda_1, \lambda_2, \tau) = n\ln\lambda_1 + n\ln\lambda_2 - n\ln(\lambda_1+\lambda_2) + \sum_{i=1}^{n}\ln\phi\!\left(\frac{y_i - \bar{y} + \frac{1}{\lambda_1} - \frac{1}{\lambda_2}}{\tau}\right) + \sum_{i=1}^{n}\ln\left[R\!\left(\lambda_1\tau - \frac{y_i - \bar{y} + \frac{1}{\lambda_1} - \frac{1}{\lambda_2}}{\tau}\right) + R\!\left(\lambda_2\tau + \frac{y_i - \bar{y} + \frac{1}{\lambda_1} - \frac{1}{\lambda_2}}{\tau}\right)\right].

This can be maximized numerically using the MMEs as starting values. Reed and Jorgensen (2004) also describe an EM algorithm which uses a likelihood function based on augmented data as a stepping stone towards maximization of the likelihood based on the observed data. Ramírez et al. (2008) described a method for carrying out Bayesian inference for the DPLN distribution and have illustrated that this model can capture both the heavy

tail behaviour and also the body of the distribution for real data examples. They applied the Bayesian approach to inference for the DPLN/M/1 and M/DPLN/1 queueing systems, where the inter-arrival and service times respectively are modelled using the DPLN distribution. Since the DPLN distribution does not possess a moment generating function or a Laplace transform in closed form, as is the case for most heavy-tailed distributions, it is impossible to obtain the equilibrium distributions of the DPLN/M/1 and M/DPLN/1 queueing systems using standard queueing theory techniques. Ramírez et al. (2008) applied a direct approximation of the non-analytical Laplace transform using a variant of the transform approximation method (TAM). They applied TAM to estimate the Laplace transform of the DPLN distribution and the waiting time distribution in the M/DPLN/1 system. By combining TAM with Bayesian inference for the DPLN distribution, they obtained numerical predictions of queueing properties such as the probability of congestion. They illustrated this methodology with real data sets, first estimating waiting times and congestion on the internet, and then computing the probability of ruin in the insurance context, making use of the duality between queues and risk theory.

6.4 Conclusion

The log-Laplace distribution, its important properties and its extension to the multivariate case are studied. The double Pareto lognormal distribution is also studied. Some divisibility properties, such as infinite divisibility and geometric infinite divisibility, as well as divisibility properties with respect to multiplication, namely multiplicative infinite divisibility and geometric multiplicative infinite divisibility, are explored. Product autoregression structures with log-Laplace and double Pareto lognormal marginals are developed. The self-decomposability property is studied.
A linear AR(1) model is developed, along with the sample path properties and the estimation of the parameters of the process. A multivariate extension of the product autoregression structure is also considered.
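As a concrete illustration of the estimation procedure of Section 6.3, the numerical maximization of the profile log-likelihood can be sketched in a few lines. The following is a minimal sketch, not the author's code: it assumes the Reed and Jorgensen NL(ν, τ², λ₁, λ₂) parametrization for the logged data, profiles ν out as ν̂ = ȳ − 1/λ₁ + 1/λ₂, and maximizes over (λ₁, λ₂, τ) with a derivative-free optimizer; neutral starting values stand in for the MMEs.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def log_mills(z):
    """log R(z), where R(z) = (1 - Phi(z)) / phi(z) is the Mills ratio;
    computed on the log scale for numerical stability."""
    return norm.logsf(z) - norm.logpdf(z)

def neg_profile_loglik(theta, y):
    """Negative profile log-likelihood L-hat(lambda1, lambda2, tau).
    theta holds log-parameters so the search is unconstrained."""
    lam1, lam2, tau = np.exp(theta)
    n = y.size
    # nu profiled out: nu_hat = ybar - 1/lambda1 + 1/lambda2
    u = (y - y.mean() + 1.0 / lam1 - 1.0 / lam2) / tau
    ll = (n * np.log(lam1) + n * np.log(lam2) - n * np.log(lam1 + lam2)
          + norm.logpdf(u).sum()
          + np.logaddexp(log_mills(lam1 * tau - u),
                         log_mills(lam2 * tau + u)).sum())
    return -ll

# Usage: an NL variate is a normal plus an asymmetric Laplace variate
# (a difference of exponentials), so data can be simulated directly.
rng = np.random.default_rng(1)
n, lam1, lam2, nu, tau = 2000, 2.0, 3.0, 1.0, 0.5
y = (nu + tau * rng.standard_normal(n)
     + rng.exponential(1.0 / lam1, n) - rng.exponential(1.0 / lam2, n))
res = minimize(neg_profile_loglik, x0=np.zeros(3), args=(y,),
               method="Nelder-Mead")
lam1_hat, lam2_hat, tau_hat = np.exp(res.x)
nu_hat = y.mean() - 1.0 / lam1_hat + 1.0 / lam2_hat
```

In practice the MMEs of Section 6.3 would replace the neutral starting point x0, and the EM algorithm of Reed and Jorgensen (2004) is an alternative to direct numerical maximization.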

References

Bagnold, R.A., Barndorff-Nielsen, O. (1980). The pattern of natural size distributions, Sedimentology 27.

Barndorff-Nielsen, O. (1977). Exponentially decreasing distributions for the logarithm of particle size, Proceedings of the Royal Society of London A 353.

Bondesson, L. (1979). A general result on infinite divisibility, Annals of Probability 7.

Fujita, Y. (1993). A generalization of the results of Pillai, Annals of the Institute of Statistical Mathematics 45.

Hartley, M.J., Revankar, N.S. (1974). On the estimation of the Pareto law from underreported data, Journal of Econometrics 2.

Hinkley, D.V., Revankar, N.S. (1977). Estimation of the Pareto law from underreported data, Journal of Econometrics 5.

Inoue, T. (1978). On Income Distribution: The Welfare Implications of the General Equilibrium Model, and the Stochastic Processes of Income Distribution Formation, Ph.D. Thesis, University of Minnesota.

Jose, K.K., Naik, S.R. (2008). A class of asymmetric pathway distributions and an entropy interpretation, Physica A: Statistical Mechanics and its Applications 387(8).

Jose, K.K., Manu, M.T. (2012). A product autoregressive model with log-Laplace marginal distribution, Statistica, anno LXXII, n. 3.

Julià, O., Vives-Rego, J. (2005). Skew-Laplace distribution in Gram-negative bacterial axenic cultures: new insights into intrinsic cellular heterogeneity, Microbiology 151.

Klebanov, L.B., Maniya, G.M., Melamed, I.A. (1984). A problem of Zolotarev and analogs of infinitely divisible and stable distributions in a scheme for summing a random number of random variables, Theory of Probability and its Applications 29.

Kozubowski, T., Podgórski, K. (2003). Log-Laplace distributions, International Mathematical Journal 3(4).

Kozubowski, T., Podgórski, K. (2010). Random self-decomposability and autoregressive processes, Statistics and Probability Letters 80.

Kotz, S., Kozubowski, T.J., Podgórski, K. (2001). The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering and Finance, Birkhäuser, Boston.

McKenzie, E.D. (1982). Product autoregression: a time-series characterization of the gamma distribution, Journal of Applied Probability 19.

Mitzenmacher, M. (2004). Dynamic models for file sizes and double Pareto distributions, Internet Mathematics 1(3).

Pillai, R.N. (1990). Harmonic mixtures and geometric infinite divisibility, Journal of Indian Statistical Association 28.

Pillai, R.N., Sandhya, E. (1990). Distributions with complete monotone derivative and geometric infinite divisibility, Advances in Applied Probability 22.

Ramírez, P., Lillo, R.E., Wiper, M.P., Wilson, S.P. (2008). Inference for double Pareto lognormal queues with applications, Working Paper 08-04, Statistics and Econometrics Series 02, Universidad Carlos III de Madrid, Spain.

Reed, W.J., Jorgensen, M. (2004). The double Pareto lognormal distribution: a new parametric model for size distributions, Communications in Statistics - Theory and Methods 33(8).

Reed, W.J. (2003). The Pareto law of incomes: an explanation and an extension, Physica A 319.

Reed, W.J. (2002). On the rank-size distribution for human settlements, Journal of Regional Science 42.

Schwämmle, V., Queirós, S.M.D., Brigatti, E., Tchumatchenko, T. (2009). Competition and fragmentation: a simple model generating lognormal-like distributions, New Journal of Physics 11.

Seshadri, M., Machiraju, S., Sridharan, A., Bolot, J., Faloutsos, C., Leskovec, J. (2008). Mobile call graphs: beyond power law and lognormal distributions, preprint.

Shanbhag, D.N., Sreehari, M. (1977). On certain self-decomposable distributions, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 38.

Uppuluri, V.R.R. (1981). Some properties of log-Laplace distribution, in Statistical Distributions in Scientific Work 4 (eds. G.P. Patil, C. Taillie and B. Baldessari), Reidel, Dordrecht.


More information

Generalised Fractional-Black-Scholes Equation: pricing and hedging

Generalised Fractional-Black-Scholes Equation: pricing and hedging Generalised Fractional-Black-Scholes Equation: pricing and hedging Álvaro Cartea Birkbeck College, University of London April 2004 Outline Lévy processes Fractional calculus Fractional-Black-Scholes 1

More information

Probability and Statistics

Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Chapter 3: Parametric families of univariate distributions CHAPTER 3: PARAMETRIC

More information

Chapter 5. Chapter 5 sections

Chapter 5. Chapter 5 sections 1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

3 Modeling Process Quality

3 Modeling Process Quality 3 Modeling Process Quality 3.1 Introduction Section 3.1 contains basic numerical and graphical methods. familiar with these methods. It is assumed the student is Goal: Review several discrete and continuous

More information

Chapter 5. Statistical Models in Simulations 5.1. Prof. Dr. Mesut Güneş Ch. 5 Statistical Models in Simulations

Chapter 5. Statistical Models in Simulations 5.1. Prof. Dr. Mesut Güneş Ch. 5 Statistical Models in Simulations Chapter 5 Statistical Models in Simulations 5.1 Contents Basic Probability Theory Concepts Discrete Distributions Continuous Distributions Poisson Process Empirical Distributions Useful Statistical Models

More information

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued Chapter 3 sections 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions 3.6 Conditional

More information

GARCH Models Estimation and Inference

GARCH Models Estimation and Inference GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in

More information

6.207/14.15: Networks Lecture 12: Generalized Random Graphs

6.207/14.15: Networks Lecture 12: Generalized Random Graphs 6.207/14.15: Networks Lecture 12: Generalized Random Graphs 1 Outline Small-world model Growing random networks Power-law degree distributions: Rich-Get-Richer effects Models: Uniform attachment model

More information

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability

More information