arxiv: v3 [stat.co] 22 Jun 2017

Similar documents
arxiv: v4 [stat.co] 18 Mar 2018

Bayesian analysis of three parameter absolute. continuous Marshall-Olkin bivariate Pareto distribution

Discriminating Between the Bivariate Generalized Exponential and Bivariate Weibull Distributions

On Sarhan-Balakrishnan Bivariate Distribution

Bivariate Geometric (Maximum) Generalized Exponential Distribution

Estimation of the Bivariate Generalized. Lomax Distribution Parameters. Based on Censored Samples

Generalized Exponential Geometric Extreme Distribution

Overview of Extreme Value Theory. Dr. Sawsan Hilal space

On Five Parameter Beta Lomax Distribution

A New Two Sample Type-II Progressive Censoring Scheme

Univariate and Bivariate Geometric Discrete Generalized Exponential Distributions

On Weighted Exponential Distribution and its Length Biased Version

Some conditional extremes of a Markov chain

Hybrid Censoring; An Introduction 2

Weighted Marshall-Olkin Bivariate Exponential Distribution

A New Class of Positively Quadrant Dependent Bivariate Distributions with Pareto

The Instability of Correlations: Measurement and the Implications for Market Risk

An Extension of the Freund s Bivariate Distribution to Model Load Sharing Systems

Tail Dependence of Multivariate Pareto Distributions

Bivariate Weibull-power series class of distributions

Geometric Skew-Normal Distribution

Financial Econometrics and Volatility Models Copulas

A Convenient Way of Generating Gamma Random Variables Using Generalized Exponential Distribution

Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

New Classes of Multivariate Survival Functions

On Bivariate Discrete Weibull Distribution

Analysis of Type-II Progressively Hybrid Censored Data

A Class of Weighted Weibull Distributions and Its Properties. Mervat Mahdy Ramadan [a],*

Bayesian Inference and Life Testing Plan for the Weibull Distribution in Presence of Progressive Censoring

Bivariate Sinh-Normal Distribution and A Related Model

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Marginal Specifications and a Gaussian Copula Estimation

COM336: Neural Computing

Extreme Value Analysis and Spatial Extremes

ABC methods for phase-type distributions with applications in insurance risk problems

Some Recent Results on Stochastic Comparisons and Dependence among Order Statistics in the Case of PHR Model

ASTIN Colloquium 1-4 October 2012, Mexico City

Stat 5101 Lecture Notes

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

A compound class of Poisson and lifetime distributions

Modelling Operational Risk Using Bayesian Inference

A Conditional Approach to Modeling Multivariate Extremes

Bayesian Inference for Clustered Extremes

Copulas. Mathematisches Seminar (Prof. Dr. D. Filipovic) Di Uhr in E

An Extension of the Generalized Exponential Distribution

On Bivariate Birnbaum-Saunders Distribution

Modified slash Birnbaum-Saunders distribution

Practice Problems Section Problems

Tail dependence in bivariate skew-normal and skew-t distributions

Introduction of Shape/Skewness Parameter(s) in a Probability Distribution

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS

STA 4273H: Statistical Machine Learning

Generalized Exponential Distribution: Existing Results and Some Recent Developments

Estimation of parametric functions in Downton s bivariate exponential distribution

Some Generalizations of Weibull Distribution and Related Processes

Polynomial approximation of mutivariate aggregate claim amounts distribution

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

ON A MULTIVARIATE PARETO DISTRIBUTION 1

Availability and Reliability Analysis for Dependent System with Load-Sharing and Degradation Facility

MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION

Quantitative Biology II Lecture 4: Variational Methods

STA 4273H: Statistical Machine Learning

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach

Stochastic Comparisons of Weighted Sums of Arrangement Increasing Random Variables

Stochastic orders: a brief introduction and Bruno s contributions. Franco Pellerey

A note on Reversible Jump Markov Chain Monte Carlo

Linear Regression. CSL603 - Fall 2017 Narayanan C Krishnan

Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Linear Regression. CSL465/603 - Fall 2016 Narayanan C Krishnan

Double Acceptance Sampling Based on Truncated Life Tests in Marshall Olkin Extended Lomax Distribution

An Extended Weighted Exponential Distribution

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Tail Approximation of Value-at-Risk under Multivariate Regular Variation

Parameter Estimation of Power Lomax Distribution Based on Type-II Progressively Hybrid Censoring Scheme

Introduction to Algorithmic Trading Strategies Lecture 10

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

Estimation of P (X > Y ) for Weibull distribution based on hybrid censored samples

Analysis of Middle Censored Data with Exponential Lifetime Distributions

Consideration of prior information in the inference for the upper bound earthquake magnitude - submitted major revision

Multivariate Survival Data With Censoring.

Multivariate Non-Normally Distributed Random Variables

Estimation Under Multivariate Inverse Weibull Distribution

Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Estimating Bivariate Tail: a copula based approach

Correlation & Dependency Structures

Bayes Estimation and Prediction of the Two-Parameter Gamma Distribution

Point and Interval Estimation of Weibull Parameters Based on Joint Progressively Censored Data

On some mixture models based on the Birnbaum-Saunders distribution and associated inference

Accounting for Missing Values in Score- Driven Time-Varying Parameter Models

Parametric Techniques

A TWO-STAGE GROUP SAMPLING PLAN BASED ON TRUNCATED LIFE TESTS FOR A EXPONENTIATED FRÉCHET DISTRIBUTION

The Expectation-Maximization Algorithm

Modelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich

Multivariate Birnbaum-Saunders Distribution Based on Multivariate Skew Normal Distribution

Sharp statistical tools Statistics for extremes

L11: Pattern recognition principles

Transcription:

arxiv:608.0299v3 [stat.co] 22 Jun 207 An EM algorithm for absolutely continuous Marshall-Olkin bivariate Pareto distribution with location and scale Arabin Kumar Dey and Debasis Kundu Department of Mathematics, IIT Guwahati Department of Mathematics and Statistics, IIT Kanpur Abstract: Recently Asimit, A. V., Furman, E. and Vernic, R (206) used EM algorithm to estimate singular Marshall-Olkin bivariate Pareto distribution. We describe absolutely continuous version of this distribution. We study estimation of the parameters by EM algorithm both in presence and without presence of location and scale parameters. Some innovative solutions are provided for different problems arised during implementation of EM algorithm. A real-life data analysis is also shown for illustrative purpose. Keywords and phrases: Joint probability density function, Absolutely Continuous bivariate Pareto distribution, EM algorithm, Pseudo likelihood function.. Introduction Recently Asimit, A. V., Furman, E. and Vernic, R (206) used EM algorithm to estimate singular Marshall-Olkin bivariate Pareto distribution. There is no paper on formulation and estimation of absolutely continuous version of this distribution. Few recent works (Kundu D. and Gupta RD. (2009), Kundu D. and Gupta RD. (200), Kundu D, Kumar A and Gupta AK. (205), Mirhosseini SM, Amini M, Kundu D and Dolati A. (205)) include Marshall- Olkin bivariate distribution. Estimation through EM algorithm can be seen there. But estimation of absolutely continuous version of bivariate pareto through EM algorithm is not straight forward. We suggest different innovative methods to solve the problem. The distribution can be used as an alternative model for the data transformed via peak over threshold method. Bivariate Pareto has wide application in modeling data related to finance, insurance, environmental sciences and internet network. The dependence structure of absolutely continuous version of bivariate Pareto can also be described by wellknown Marshall-Olkin copula [Nelsen, R. B. (2007), Marshall, A. W. (996)]. The methodology proposed in this paper works for moderately large sample. Block H. and Basu AP (974) proposed an absolutely continuous bivariate exponential distribution. Later Kundu D. and Gupta RD. (20) introduced an absolutely continuous bivariate generalized exponential distribution, whose marginals are not generalized exponential distributions. Many papers related to multivariate extreme value theory talk about absolutely continuous/ singular bivariate and multivariate Pareto distribution [Yeh, H. C. (2000), Yeh, H. C.

Dey and Kundu/Dey and Kundu 2 (2004), Smith, R. L. (994), Rakonczai, P. and Zempléni, A. (202)]. Development of multivariate pareto is an asymptotic process in extreme value theory. None of the paper mentioned specifically anything about Marshal-Olkin form of the distribution or its estimation through EM algorithm. One of the major problems with both Marshall-Olkin singular/absolutely continuous Pareto distribution with location and scale parameter is the verification of the assumption on the distributional form of the data. In this paper we mainly deal with the estimation issues through EM algorithm under the assumption that data follows absolutely continuous bivariate Pareto distribution. Even taking location and scale parameter as zero and one respectively, estimation of the Marshall-Olkin absolutely continuous Pareto distribution is not straight forward. We arrange the paper in the following way. In section 2 we keep the formulation of Marshall-Olkin bivariate Pareto and its absolutely continuous version. In section 3, we describe the EM algorithm for singular case. Different extension of EM algorithm for absolutely continuous version is available in section 4. Some simulation results show the performance of the algorithm in section 5. In section 6 we show data analysis. Finally we conclude the paper in section 7. 2. Formulation of Marshal-Olkin bivariate pareto A random variable X is said to have Pareto of second kind, i.e. X Pa(II)(µ, σ, α) if it has the survival function F X (x; µ, σ, α) P(X > x) (+ x µ σ ) α and the probability density function (pdf) f(x; µ, σ, α) α σ x µ (+ σ ) α with x > µ R, σ > 0 and α > 0. Let U 0, U and U 2 be three independent univariate pareto distributions Pa(II)(0,, α 0 ), Pa(II)(µ, σ, α ) and Pa(II)(µ 2, σ 2, α 2 ). We define X min{σ U 0 + µ, U } and X 2 min{σ 2 U 0 + µ 2, U 2 }. We can show that (X, X 2 ) jointly follow Marshall-Olkin bivariate Pareto distribution of second kind and we denote it as BVPA(µ, σ, µ 2, σ 2, α 0, α, α 2 ). The joint distribution can be given by where f(x, x 2 ) f (x, x 2 ) if x µ σ < x 2 µ 2 σ 2 f 2 (x, x 2 ) if x µ σ > x 2 µ 2 σ 2 f 0 (x) if x µ σ x 2 µ 2 σ 2 f (x, x 2 ) α σ σ 2 (α 0 + α 2 )(+ (x 2 µ 2 ) σ 2 ) (α 0+α 2 +) (+ (x µ ) σ ) (α +) x

f(x, y) y f(x, y) f(x, y) y f(x, y) y y Dey and Kundu/Dey and Kundu 3 f 2 (x, x 2 ) α 2 σ σ 2 (α 0 + α )(+ x µ σ ) (α 0+α +) (+ x 2 µ 2 σ 2 ) (α 2+) f 0 (x) α 0 σ (+ x µ σ ) (α +α 2 +α 0 +) We denote absolutely continuous part as BVPAC. Surface and contour plots of the pdf are shown in Figure- and Figure-2 respectively. Surface plot for µ,µ 22,σ 0.4,σ 20.5,α 02,α.2,α 2.4 Surface plot for µ 0,µ 20,σ,σ 20.5,α 0,α 0.3,α 2.4 x x (a) ξ (b) ξ 2 Surface plot for µ 0,µ 20,σ.4,σ 20.5,α 0,α,α 2.4 Surface plot for µ 0,µ 20,σ.4,σ 20.5,α 02,α 0.4,α 20.5 x x (c) ξ 3 (d) ξ 4 Fig. Density plots of BVPAC for different sets of parameters

2 0.5 0.0025 0.005 Dey and Kundu/Dey and Kundu 4 Surface plot for µ,µ 22,σ 0.4,σ 20.5,α 02,α.2,α 2.4 Surface plot for µ 0,µ 20,σ,σ 20.5,α 0,α 0.3,α 2.4 2.0 2.5 3.0 3.5.5 0.0 0.5.0.5 2.0 2.5 3.0 0.002 0.003 0.003 0.006 0.005 0.004 0.00.0.5 2.0 2.5 0.0 0.5.0.5 2.0 2.5 3.0 (a) ξ (b) ξ 2 Surface plot for µ 0,µ 20,σ.4,σ 20.5,α 0,α,α 2.4 Surface plot for µ 0,µ 20,σ.4,σ 20.5,α 02,α 0.4,α 20.5 0.0 0.5.0.5 2.0 2.5 3.0 0.00 0.002 0.003 0.0025 0.004 0.003 0.0035 0.0025 0.005 5e 04 2.0 2.5 3.0 3.5 4.0 4.5 5.0 0.00 0.005 0.002 0.0035 0.004 0.003 5e 04 0.0 0.5.0.5 2.0 2.5 3.0.0.5 2.0 2.5 3.0 3.5 4.0 (c) ξ 3 (d) ξ 4 Fig 2. Contour plots of the density of BVPAC for different sets of parameters 2.. Absolutely Continuous Extension We assume that the random vector (Z, Z 2 ) follows an absolutely continuous bivariate pareto distribution (as described in Kundu D. and Gupta RD. (200)). The joint pdf of BVPAC can be written as f BVPAC (z, z 2 ) { c f (z, z 2 ) c f PA (z ; µ, σ, α ) f PA (z 2 ; µ 2, σ 2, α 0 + α 2 ) if (z µ ) σ < (z 2 µ 2 ) σ 2 c f 2 (z, z 2 ) c f PA (z ; µ, σ, α 0 + α ) f PA (z 2 ; µ 2, σ 2, α 2 ) if (z µ ) σ > (z 2 µ 2 ) σ 2 where c is a normalizing constant and c α 0+α +α 2 α +α 2 pdf of univariate distribution. and f PA ( ) denotes the

Dey and Kundu/Dey and Kundu 5 Likelihood function for the parameters of the BVPAC can be written as L(µ, µ 2, σ, σ 2, α 0, α, α 2 ) n ln(α 0 + α + α 2 ) n ln(α + α 2 )+n ln α + n ln(α 0 + α 2 ) n ln σ n ln σ 2 (α 0 + α 2 + ) ln(+ z 2i µ 2 ) σ i I 2 (α + ) i I ln(+ z i µ σ ) n 2 σ n 2 ln σ 2 + n 2 ln α 2 + n 2 ln(α 0 + α ) (α 0 + α + ) ln(+ z i µ ) σ i I 2 (α 2 + ) ln(+ z 2i µ 2 ) σ i I 2 2 The three parameter or likelihood function taking µ µ 2 0 and σ σ 2 corresponding to this pdf can be given by L(α 0, α, α 2 ) (n + n 2 ) ln(α 0 + α + α 2 ) (n + n 2 ) ln(α + α 2 ) + i I ln f PA (z i ; 0,, α )+ i I ln f PA (z 2i ; 0,, α 0 + α 2 ) + ln f PA (z i ; 0,, α 0 + α )+ ln f PA (z 2i ; 0,, α 2 ) i I 2 i I 2 (n + n 2 ) ln(α 0 + α + α 2 ) (n + n 2 ) ln(α + α 2 ) + n ln α (α + ) i I ln(+z i )+n ln(α 0 + α 2 ) (α 0 + α 2 + ) i I ln(+ z 2i )+n 2 ln(α 0 + α ) (α 0 + α + ) i I 2 ln(+ z i )+n 2 ln α 2 (α 2 + ) i I 2 ln(+ z 2i ) Marginal distribution of Z and Z 2 can be obtained by routine calculation. Expressions are as follows : f Z (z ; µ, σ, α 0 + α ) c f PA (z ; µ, σ, α 0 + α ) c (α 0 + α + α 2 ) f PA(z ; µ, σ, α 0 + α + α 2 ) f Z2 (z 2 ; µ 2, σ 2, α 0 + α 2 ) c f PA (z 2 ; µ 2, σ 2, α 0 + α 2 ) c (α 0 + α + α 2 ) f PA(z 2 ; µ 2, σ 2, α 0 + α + α 2 ) respectively, where c (α 0+α +α 2 ). (α +α 2 ) α 0 α 0

Dey and Kundu/Dey and Kundu 6 3. EM-algorithm for singular three parameter BVPA distribution Now we divide our data into three parts :- I 0 {(x i, x 2i ) : x i x 2i },I {(x i, x 2i ) : x i < x 2i }, I 2 {(x i, x 2i ) : x i > x 2i } We do not know if X is U 0 or U. We also do not know if X 2 is U 0 or U 2. So we introduce two new random variables ( ; 2 ) as { 0 if X U 0 if X U and 2 { 0 if X 2 U 0 2 if X 2 U 2 The and 2 are the missing values of the E-M algorithm. To calculate the E-step we need the conditional distribution of and 2. Using the definition of X and X 2 and and 2 we have : For group I 0, both and 2 are known, 2 0 For group I, is known, 2 is unknown,, 2 0 or 2 Therefore we need to find out u P( 2 0 I )and u 2 P( 2 2 I ) For group I 2, 2 is known, is unknown, 0 or, 2 2 Moreover we need w P( 0 I 2 ) and w 2 P( I 2 ) Since, each posterior probability corresponds to one of the ordering from Table-3, we calculate u, u 2, w, w 2 using the probability of appropriate ordering. Ordering (X, X 2 ) Group U 0 < U < U 2 (U 0, U 0 ) I 0 U 0 < U 2 < U (U 0, U 0 ) I 0 U < U 0 < U 2 (U, U 0 ) I U < U 2 < U 0 (U, U 2 ) I U 2 < U 0 < U (U 0, U 2 ) I 2 U 2 < U < U 0 (U, U 2 ) I 2 We have the following expressions for u, u 2, w, w 2 u P(U < U 0 < U 2 ) P(U < U 0 < U 2 )+P(U < U 2 < U 0 )

Now we have u 2 w w 2 P(U < U 0 < U 2 ) Dey and Kundu/Dey and Kundu 7 P(U < U 2 < U 0 ) P(U < U 0 < U 2 )+P(U < U 2 < U 0 ) P(U 2 < U 0 < U ) P(U 2 < U 0 < U )+P(U 2 < U < U 0 ) P(U 2 < U < U 0 ) P(U 2 < U 0 < U )+P(U 2 < U < U 0 ) 0 [ (+ x) α ]α 0 (+ x) (α 0+) (+ x) (α 2) α 0 (+ x) (α 0+α 2 +) (+x) (α 0+α +α 2 +) 0 α 0 α (α 0 + α 2 + )(α 0 + α + α 2 + ) Using this above result we evaluate the other probabilites to get values of u, u 2, w, w 2 as : u α 0 α 0 +α 2 and u 2 α 2 α 0 +α 2 w α 0 α 0 +α and w 2 α α 0 +α 3.. Pseudo-likelihood expression We define n 0, n, n 2 as : n 0 I 0, n I, n 2 I 2. where I j for j 0,, 2 denotes the number of elements in the set I j. Now the pseudo log-likelihood can be written down as L(α 0, α, α 2 ) α 0 ( i I 0 ln(+ x i )+ i I 2 ln(+ x i )+ i I ln(+ x 2i )) + (n 0 + u n + w n 2 ) ln α 0 α ( i I 0 ln(+ x i )+ i I I 2 ln(+ x i )) + (n + w 2 n 2 ) ln α α 2 ( ln(+ x i )+ ln(+ x 2i )) i I 0 i I I 2 + (n 2 + u 2 n ) ln α 2 (3.) Therefore M-step involves maximizing (3.) with respect to α 0, α, α 2 at 0 n 0 + u (t) n + w (t) n 2 i I0 ln(+ x i )+ i I2 ln(+x i )+ i I ln(+ x 2i ). (3.2) 2 Therefore the algorithm can be given as (n + w (t) 2 n 2) i I0 ln(+ x i )+ i (I I 2 ) ln(+x i) n 2 + u (t) 2 n ( i I0 ln(+ x i )+ i (I I 2 ) ln(+x 2i)) (3.3) (3.4)

Dey and Kundu/Dey and Kundu 8 Algorithm EM procedure for bivariate Pareto distribution : while Q/Q < tol do 2: Compute u (i), u(i) 2, w(i), w(i) 2 from α (i) 0, α(i), α(i) 2. 3: Update α (i+) 0, α (i+), α (i+) 2 using Equation (3.2), (3.3) and (3.4). 4: Calculate Q for the new iterate. 5: end while 4. EM-algorithm for absolutely continuous case We adopt similar existing methods described in Kundu D. and Gupta RD. (200), we replace the missing observations n 0 falling within I 0 and each observation U 0 by its estimate 0 and E(U 0 U 0 < min{u, U 2 }). Note that in this case, n 0 is a random variable which has negative binomial with parameters (n + n 2 ) and α +α 2 α 0 +α +α 2. α 0 Clearly 0 (n + n 2 ) α +α 2, E(U 0 U 0 < min{u, U 2 }) (α 0 +α +α 2 ) An important restriction for this approximation is that we have to ensure α 0 + α + α 2 >. The restriction will ensure the existence of the above expectation. Therefore under above mentioned restriction, we write the pseudo likelihood function by replacing the missing observations with its expected value. L(α 0, α, α 2 ) α 0 ( 0 ln(+ a)+ ln(+ x i )+ ln(+ x 2i )) i I 2 i I + ( 0 + u n + w n 2 ) ln α 0 α ( 0 ln(+a)+ ln(+ x i )) i I I 2 + (n + w 2 n 2 ) ln α α 2 ( 0 ln(+ a)+ ln(+ x 2i )) i I I 2 + (n 2 + u 2 n ) ln α 2 (4.) Therefore M-step involves maximizing (4.) with respect to α 0, α, α 2 at 0 0 + u (t) n + w (t) n 2 0 ln(+ a)+ i I2 ln(+ x i )+ i I ln(+ x 2i ). (4.2) 2 (n + w (t) 2 n 2) 0 ln(+ a)+ i (I I 2 ) ln(+ x i) n 2 + u (t) 2 n ( 0 ln(+ a)+ i (I I 2 ) ln(+ x 2i)) (4.3) (4.4) Therefore the modified EM algorithm steps would be according to Algorithm- 2. Important Remark : The above method fails when α 0 + α + α 2 >.

Dey and Kundu/Dey and Kundu 9 Algorithm 2 Modified EM procedure for three parameter absolutely continuous bivariate Pareto distribution : while Q/Q < tol do 2: Compute u (i), u(i) 2, w(i), w(i) 2, 0, a from α (i) 0, α(i), α(i) 2. 3: Update α (i+) 0, α (i+), α (i+) 2 using Equation (4.2), (4.3) and (4.4). 4: Calculate Q for the new iterate. 5: end while 4.. Proposed Modification : To make the algorithm valid for any range of parameter, instead of estimating U 0 we estimate ln(+u 0 ) conditional on U 0 < min{u, U 2 }. Now since ln(x) is an increasing function in x, we condition on the equivalent event ln(+u 0 ) < min{ln(+u ), ln(+u 2 )}. Therefore we replace the unknown missing information n 0 and ln(+u 0 ) by and α 0 0 (n + n 2 ) α + α 2 a E(ln(+U 0 ) ln(+u 0 ) < min{ln(+u ), ln(+u 2 )}) (α 0 + α + α 2 ) respectively. Therefore M-step involves maximizing (4.) with respect to α 0, α, α 2 at 0 0 + u(t) n + w (t) n 2 0a + i I2 ln(+ x i )+ i I ln(+ x 2i ). (4.5) (n + w (t) 2 n 2) 0a + i (I I 2 ) ln(+ x i) (4.6) 2 n 2 + u (t) 2 n ( 0a + i (I I 2 ) ln(+ x 2i)) (4.7) Therefore the final EM steps for three parameters can be kept according to Algorithm-3. Algorithm 3 Final modified EM procedure for three parameter absolutely continuous bivariate Pareto distribution : while Q/Q < tol do 2: Compute u (i), u(i) 2, w(i), w(i) 2, 0, a from α (i) 0, α(i), α(i) 2. 3: Update α (i+) 0, α (i+), α (i+) 2 using Equation (4.5), (4.6) and (4.7). 4: Calculate Q for the new iterate. 5: end while

Dey and Kundu/Dey and Kundu 0 4.2. Algorithm for seven parameter absolutely continuous Pareto distribution For seven parameter case we first estimate the location parameters from the marginal distributions and keep it fixed. Estimates of location parameters can be replaced by minimum of the marginals of the data. Our EM algorithm update scale parameters and α 0, α, α 2 at every iteration. We use one step ahead gradient descend of σ and σ 2 based on likelihood of its marginal density. We can not use the bivariate likelihood function in gradient descend updates of σ and σ 2 as the likelihood function may not be differentiable with respect to σ and σ 2. Once we get an update of the scale parameters, we first fix I and I 2 and try to mimic the EM steps for α 0, α and α 2 discussed in previous section. We define z i (x i µ ˆ ) σˆ and z 2i (x 2i µ ˆ 2 ) σˆ 2. If estimates of location and scale parameters are correct(z i, Z 2i ) follow a bivariate Pareto distribution with parameter sets (0, 0,,, α 0, α, α 2 ). But one step ahead scale parameter won t be a correct estimate of scale parameters, however we assume approximate distribution of (Z i, Z 2i ) would be Pareto with parameter sets (0, 0,,, α 0, α, α 2 ). Therefore M-step expression would be same as before : 0 n 0 + u (t) n + w (t) n 2 i I0 ln(+z i )+ i I2 ln(+ z i )+ i I ln(+z 2i ). (4.8) 2 (n + w (t) 2 n 2) i I0 ln(+ z i )+ i (I I 2 ) ln(+ z i) n 2 + u (t) 2 n ( i I0 ln(+ z i )+ i (I I 2 ) ln(+ z 2i)) (4.9) (4.0) We make similar modifications of the above expressions. As usual observations within I 0 and cardinality of I 0 would be unknown in this case. We estimate i I0 ln(+z i ) by n0 a 0, where a 0 E(log(+U 0) log(+u 0 ) < min{log(+u ), log(+u 2 )}) (α 0 +α +α 2 ) and n 0 by 0 (n α + n 2 ) 0 α +α 2. Therefore final M-step expression would be same as before : 0 0 + u(t) n + w (t) n 2 0 a 0 + i I 2 ln(+z i )+ i I ln(+ z 2i ). (4.) 2 (n + w (t) 2 n 2) 0 a 0 + i (I I 2 ) ln(+ z i) n 2 + u (t) 2 n ( 0 a 0 + i (I I 2 ) ln(+ z 2i)) (4.2) (4.3)

Dey and Kundu/Dey and Kundu The key intuition of this algorithm is very similar to stochastic gradient descend. As we increase the number of iteration, EM steps try to force the three parameters α 0, α, α 2 to pick up the right direction starting from any value and gradient descent steps of σ and σ 2 gradually ensure them to roam around the actual values. One of the drawback with this approach is that it considers too many approximations. However this algorithm works even for moderately large sample sizes. Sometimes it takes lot of time to converge or roam around the actual value for some really bad sample. Probability of such events are very low. We stop the calculation after 2000 iteration in such cases. Our numerical result in next section shows that this does not make any significant effect in the calculation of mean square error. Therefore algorithmic steps of our proposed algorithm for seven parameter absolutely continuous bivariate Pareto distribution can be kept according to Algorithm-4. Algorithm 4 EM procedure for seven parameter absolutely continuous bivariate Pareto distribution : Take minimum of the marginals as estimates of the location parameters. Initialize other parameters. 2: while Q/Q < tol do 3: Compute one step ahead gradient descend update of scale parameters based on likelihood function of the marginals. 4: Fix I and I 2 based on the estimated location and scale parameters. 5: Compute u (i), u(i) 2, w(i), w(i) 2, 0, a 0 from α(i) 0, α(i), α(i) 2. 6: Update α (i+) 0, α (i+), α (i+) 2 using Equation (4.), (4.2) and (4.3). 7: Calculate Q for the new iteration. 8: end while 5. Numerical Results We use package R 3.2.2 to perform the estimation procedure. All the programs will be available to author on request. We generate samples of different sizes to calculate the average estimates based on 000 simulations. We also calculate mean square error and parametric bootstrap confidence interval for all parameters. The result shown in Table- deals only with three unknown parameters following Algorithm-3. We start EM algorithm taking initial guesses for the parameters as α 0 2, α 2.2, α 2 2.4. We try other initial guesses, but average estimates and MSEs are same. Estimates are calculated based on sample sizes n 50, 50, 250, 350, 450 respectively. We use stopping criteria as absolute value of likelihood changes with respect to previous likelihood at each iteration. In stopping criteria we use tolerance value as 0.0000. The results shown here are based on modified algorithm which works for any choice of parameters within its usual range. The average estimates (AE), mean squared error (MSE), 95% parametric bootstrap confidence intervals are obtained based on 000 replications.

Dey and Kundu/Dey and Kundu 2 Table-2 provides the estimates in presence of location and scale parameters following Algorithm-4. In this case, we take sample size n 450, 550, 000, 500. However the algorithm works even for smaller sample sizes, although mean square errors are little higher. Since EM algorithm starts after plug-in the estimates of location parameters as minimum of the maginals, we take initial values of other parameters as α 0 2, α.2, α 2.4, σ 0.6, σ 2 0.7. We find average estimates are closer to true values of the parameters. However parametric bootstrap confidence interval does not reflect good results for small samples in both the cases. The interval width gets shorter as sample size increases. parameters α 0 α α 2 n 50 (AE).634 0.678 0.807 MSE 0.594 0.278 0.370 Parametric Bootstrap [0.56, 2.733] [0.0003,.479] [0.0004,.759] 50 (AE).865 0.4944abs 0.624 MSE 0.233 0.089 0.25 Parametric Bootstrap [.023, 2.7] [0.08,.066] [0.02,.267] 250 (AE).9227 0.467 0.577 MSE 0.25 0.054 0.074 Parametric Bootstrap [.86, 2.630] [0.044, 0.948] [0.047,.50] 350 (AE).935 0.428 0.543 MSE 0.09 0.033 0.050 Parametric Bootstrap [.37, 2.595] [0.064, 0.800] [0.087, 0.976] 450 (AE).966 0.427 0.528 MSE 0.06 0.027 0.039 Parametric Bootstrap [.505, 2.486] [0.2, 0.736] [0.38, 0.898] Table The average estimates (AE), the Mean Square Error (MSE) and parametric bootstrap confidence interval for α 0 2, α 0.4 and α 2 0.5

Dey and Kundu/Dey and Kundu 3 parameters µ µ 2 σ n 450 AE 0.0 0.0 0.753 MSE 0.0000038 0.0000023 0.0203 Parametric Bootstrap [0.0005, 0.05] [0.0003, 0.0383] [0.5903, 0.9989] 450 σ 2 α 0 α α 2 AE 0.804.820 0.458 0.626 MSE 0.059 0.09738 0.02546 0.076 Parametric Bootstrap [0.5554,.03] [.35, 2.358] [0.708, 0.7569] [0.982,.439] 550 parameters µ µ 2 σ AE 0.07 0.009 0.762 MSE 0.0000026 0.000006 0.02 Parametric Bootstrap [0.00039, 0.03985] [0.00094, 0.0395] [0.5832,.0402] 550 σ 2 α 0 α α 2 AE 0.807.820 0.466 0.632 MSE 0.04 0.095 0.026 0.076 Parametric Bootstrap [0.565,.026] [.3666, 2.3576] [0.92, 0.7603] [0.2076,.0779] 000 parameters µ µ 2 σ AE 0.006574 0.004984 0.87 MSE 0.000000848 0.00000050 0.0085 Parametric Bootstrap [0.000, 0.0236] [0.000, 0.089] [0.6334,.0226] 000 σ 2 α 0 α α 2 AE 0.793.855 0.48 0.63 MSE 0.02 0.0655 0.0260 0.0622 Parametric Bootstrap [0.626,.0202] [.4660, 2.2834] [0.26, 0.74] [0.2482, 0.9565] 500 parameters µ µ 2 σ AE 0.004 0.003 0.8095 Parametric Bootstrap [0.000085, 0.055] [0.000054, 0.03] [0.6550, 0.9965] 500 σ 2 α 0 α α 2 AE 0.823.8892 0.4790 0.636 Parametric Bootstrap [0.6426, 0.9938] [.5027, 2.253] [0.2336, 0.7488] [0.2656, 0.9667] Table 2 The average estimates (AE), the Mean Square Error (MSE) and parametric bootstrap confidence interval for µ 0., µ 2 0., σ 0.8, σ 2 0.8, α 0 2, α 0.4 and α 2 0.5 6. Data Analysis We take portfolio of two stocks, American Express (axp) and Standard and Poor s 500 (sp500). Data show the simple returns of two companies from 04/09/200 to 30/09/20 collected from NYSE. First we transform the simple return series into percentage of log-return of two stocks. From Falk, M. and Guillou, A. (2008), we know that peak over threshold method on random variable U provides polynomial generalized Pareto distribution for any x 0 with +log(g(x 0 )) (0, ) i.e. P(U > tx 0 U > x 0 ) t α, t where G( ) is the distribution function of U. We choose appropriate t and x 0 so that data should behave more like Pareto distribution. The transformed data set does not have any singular component. Therefore one possible assumption can be absolutely continuous Marshall Olkin bivariate Pareto. We verify our assumption in different ways. We fit the empirical survival functions with the marginals of this bivariate Pareto whose parameters can be obtained from the proposed EM algorithm. We find really good fit for both the marginals in Figure-3. The data analysis is performed based on sample size 263. We also verify our assumption by plotting empirical two dimensional density plot in Figure-4 which resembles closer to the surface of Marshall-Olkin bivariate Pareto distribution.

Z t Dey and Kundu/Dey and Kundu 4 Empirical Survival Plot Empirical Survival Plot Survival Function 0.0 0.2 0.4 0.6 0.8.0 Survival Function 0.0 0.2 0.4 0.6 0.8.0 0.0 0.5.0.5 2.0 marginalaxp 0.0 0.2 0.4 0.6 0.8.0 marginalaxp2 (a) ξ (b) ξ 2 Fig 3. Survival plots for two marginals of the transformed dataset Surface Plot of empirical density s Fig 4. Two dimensional density plots for the transformed dataset The estimated parameters can be given as µ 0.088, µ 2 0.088, σ 4.944, σ 2 2.652, α 0 2.559, α 0.824 and α 2 0.958. The parametric bootstrap confidence interval calculated for the parameters can be given as µ : [0.088, 0.096], µ 2 : [0.088, 0.09], σ : [.78, 4.42], σ 2 : [.099, 2.788], α 0 : [.2, 2.643], α : [0.247, 0.805], α 2 : [0.429,.396]. 7. Conclusion: We observe successful implementation of EM algorithm for absolutely continuous distribution. The approximation process runs in multiple stages. Therefore the algorithm works better for larger sample size. Estimation procedure even in the case of three parameters is not straightforward. The paper also shows some innovative approach to handle the estimation in case of location and scale parameters. The solution is wonderful, though there is very low probability that for some bad sample it may not work. But that s not a problem as it will roam around closer to the actual value. Since the likelihood function

Dey and Kundu/Dey and Kundu 5 is discontinuous with respect to location and scale parameters, it is very difficult to construct aymptotic confidence interval. Our approach by parametric bootstrap confidence interval provides useful solution of this problem. Some approximate bayesian techniques like Lindley requires maximum likelihood estimates. Our EM algorithm can be a good alternative in such cases. In fact bayesian estimation separately can be explored for this distribution. The work is in progress. References Arnold, B. (967), A note on multivariate distributions with specified marginals, Journal of the American Statistical Association, 62, 460-46. Asimit V., Furman, E. and Vernic, R. (200), On a Multivariate Pareto Distributions, Insurance: Mathematics and Economics, 46, 308-36. Asimit, A. V, Vernic, R. and Zitikis, R. (203), Evaluating risk measures and capital allocations based on multi-losses driven by a heavy-tailed background risk: the multivariate Pareto-II model, Risks,, 4 33. Asimit, A. V., Furman, E. and Vernic, R. (206), Statistical Inference for a New Class of Multivariate Pareto Distributions, Communications in Statistics- Simulation and Computation, 45, 456-47. Balakrishna, N., Gupta, R. C., Kundu, D., Leiva, V. and Sanhueza, A. On some mixture models based on the Birnbaum-Saunders distribution and associated inference, Journal of Statistical Planning and Inference, 4, 275-290. Block H. and Basu AP. A continuous bivariate exponential extension. Journal of the American Statistical Association, 69, 03-037. Campbell, J. I. and S. Austin. Effects of response time deadlines on adults strategy choices for simple addition. Memory & Cognition, 30, 988 994. Dey, A. K. and Kundu, D. Discriminating between bivariate generalized exponential and bivariate Weibull distributions. Chilean Journal of Statistics, 3, 93 0. Falk, M. and Guillou, A. Peaks-over-threshold stability of multivariate generalized Pareto distributions. Journal of Multivariate Analysis, 99, 75 734. Gupta RD and Kundu D. Generalized Exponential Distributions. Australian and New Zealand Journal of Statistics, 4, 73-88. Gupta RD and Kundu D. Absolute Continuous Bivariate Generalized Exponential Distribution. Advances in Statistical Analysis, 95, 69-85. Hanagal, D. D. A multivariate Pareto distribution. Communications in Statistics- Theory and Methods, 25, 47 488. Marshall AW. and Olkin I. A Multivariate Exponential distribution. Journal of the American Statistical Association, 62, 30-44. Ristic, Miroslav and Kundu, D. Marshall-Olkin generalized exponential distribution, Metron. 73, 37-333. Kundu D. On Sarhan-Balakrishnan bivariate distribution. Journal of Statistical Applications & Probability,, 63-70. Khosravi, M, Kundu, D. and Jamalzadeh, A. On bivariate and mixture of bivariate Birnbaum-Saunders distributions. 23, - 7.

Dey and Kundu/Dey and Kundu 6 Kundu, D. and Dey, AK. Estimating parameters of the Marshall Olkin bivariate Weibull Distribution by EM Algorithm. Computational Statistics and Data Analysis, 53, 956-965. Kundu D. and Gupta RD. A class of absolute continuous bivariate distributions. Statistical Methodology, 7, 464-477. Kundu D. and Gupta RD. A class of absolute continuous bivariate distributions. Statistical Methodology, 7, 464-477. Kundu D. and Gupta RD. Absolute continuous bivariate generalized exponential distribution. AStA Advances in Statistical Analysis, 95, 69 85. Kundu D, Kumar A and Gupta AK. Absolute continuous multivariate generalized exponential distribution. Sankhya, Ser. B., 77, 75-206. Marshall, A. W. Copulas, marginals, and joint distributions. Lecture Notes- Monograph Series, 23 222. Marshall, A. W. and Olkin, I. A multivariate exponential distribution. Journal of the American Statistical Association, 62, 30 44. McLachlan, Geoffrey and Peel, D. Finite Mixture Models. Wiley Series in Probability and Statistics. Mirhosseini SM, Amini M, Kundu D and Dolati A. On a new absolutely continuous bivariate generalized exponential distribution. Statistical Methods and Applications, 24, 6-83 Nelsen, R. B. An introduction to copulas. Springer Science & Business Media. Rakonczai, P. and Zempléni, A. Bivariate generalized Pareto distribution in practice: models and estimation. Environmetrics, 23, 29 227. Sarhan, A. M. and Balakrishnan, N. A new class of bivariate distributions and its mixture. Journal of Multivariate Analysis, 98, 508 527. Smith, R. L. Multivariate threshold methods : Extreme Value Theory and Applications. Springer, 98, 225 248. Smith, R. L., Tawn, J. A. and Coles, S. G. Markov chain models for threshold exceedances. Biometrika, 84, 249 268. Tsay, R. S. An introduction to analysis of financial data with R. Biometrika, Johnson Wiley & Sons. Yeh, H. C. Two multivariate Pareto distributions and their related inferences. Bulletin of the Institute of Mathematics, Academia Sinica, 28, 7 86. Yeh, H. C. Some properties and characterizations for generalized multivariate Pareto distributions. Journal of Multivariate Analysis, 88, 47 60.