The LmB Conferences on Multivariate Count Analysis

Title: On Poisson-exponential-Tweedie regression models for ultra-overdispersed count data
Rahma ABID, C.C. Kokonendji & A. Masmoudi
Email Address: rahma.abid.ch@gmail.com
Besançon: 2018.07.06

Some related works:
- Abid et al. (2018a). Geometric dispersion models with quadratic v-functions. Submission Needing Revision.
- Abid et al. (2018b). Geometric Tweedie regression models for continuous and semicontinuous data with variation phenomenon. Submitted for publication.
- Abid et al. (2018c). On Poisson-exponential-Tweedie regression models for ultra-overdispersed count data. To submit asap.
- Abid et al. (2018d). Multivariate Poisson-exponential-Tweedie regression models for the analysis of maintenance building data. Work in progress.
Rahma ABID, Poisson-exponential-Tweedie regression models

Outline:
1. Introduction: Motivations
2. Poisson-exponential-Tweedie (PET) models (= Geometric Poisson-Tweedie)
3. PET regression models
4. Simulation studies and applications
5. Conclusion & Perspectives

1. Introduction: Motivations
Overdispersion is frequent and can be induced by zero-inflation; both phenomena are defined with respect to the Poisson distribution; see, e.g., [a], [b], [c], [d]. Count models have been built through compounding and mixing of the Poisson distribution.
1) What should be done when the degree or level of overdispersion is very high?
2) Should its measure be made relative to a reference count distribution other than Poisson?
3) How to build a new family of ultra-overdispersed count models?
[a] Hinde, J. and Demétrio, C.G.B. (1998). Overdispersion: Models and Estimation. Associação Brasileira de Estatística, São Paulo.
[b] Kokonendji, C.C., Dossou-Gbete, S. and Demétrio, C.G.B. (2004). Some discrete exponential dispersion models: Poisson-Tweedie and Hinde-Demétrio classes. Statistics and Operations Research Transactions 28:201-214.
[c] Kokonendji, C.C., Demétrio, C.G.B. and Zocchi, S.S. (2007). On Hinde-Demétrio regression models for overdispersed count data. Statistical Methodology 4:271-291.
[d] Bonat, W.H., Jørgensen, B., Kokonendji, C.C., Hinde, J. and Demétrio, C.G.B. (2018). Extended Poisson-Tweedie: properties and regression models for count data. Statistical Modelling 18:24-49.

Example of application
Application to repairable systems in reliability [e]: data on the number of buildings subject to maintenance (e.g., plumbing, roof, heating, cooling system, etc.) until the year 1985. The buildings belong to one system that is 46 years old: $(Y_1, \ldots, Y_i, \ldots, Y_{46})$.
Dispersion index = 187339 / 205.717 = 910.662 $\gg$ 1: ultra-overdispersed.
Some related references:
- Jørgensen, B. and Kokonendji, C.C. (2011). Dispersion models for geometric sums. Brazilian Journal of Probability and Statistics 25:263-293.
- Jørgensen, B. and Kokonendji, C.C. (2016). Discrete dispersion models and their Tweedie asymptotics. AStA Advances in Statistical Analysis 100:43-78.
[e] Yeoeman, A. (1987). Forecasting Building Maintenance Using The Weibull Process, M.S. Thesis, University of Missouri-Rolla, United States.

Geometric sums of count random variables
A geometric sum of Poisson-Tweedie models is defined by
$Y = \sum_{l=1}^{G} PT_l$,
where $PT_1, PT_2, \ldots$ are i.i.d. copies of a Poisson-Tweedie random variable $PT$ and $G \sim \mathrm{Geom}(q)$.
Decomposition of the number of building maintenance actions $Y_i$ at age $i$ as a geometric sum of maintenance actions per building.
For $p \in \{0\} \cup [1, \infty)$, the class of Poisson-Tweedie models $PTw_p(\tilde m, \phi)$ is defined by
$Z \sim Tw_p(\tilde m, \phi)$ and $PT \mid Z \sim \mathrm{Poisson}(Z)$.
$PT \sim PTw_p(\tilde m, \phi)$ has moments $E(PT) = \tilde m$ and $\mathrm{Var}(PT) = \tilde m + \phi \tilde m^{p}$.
Given the expectation $m = EY$, the variance of $Y$ is of the form $\mathrm{Var}\,Y = m + m^2 + \phi m^p$.

2. Poisson-exponential-Tweedie (PET) models (= Geometric Poisson-Tweedie)
The class of Exponential-Poisson-Tweedie:
$X \sim \mathrm{Exp}(1)$, $[Y \mid X] \mid Z \sim \mathrm{Poisson}(Z)$ and $Z \sim Tw_p(Xm, X^{1-p}\phi)$. (1)
The class of Poisson-exponential-Tweedie ($PETw_p(m, \phi)$):
$Y \mid Z \sim \mathrm{Poisson}(Z)$, $Z \sim Tw_p(Xm, X^{1-p}\phi)$ and $X \sim \mathrm{Exp}(1)$. (2)
Proposition (Abid et al., 2018c)
Let $Y_1$ and $Y_2$ be two random variables defined by (1) and (2), respectively. Then
(i) $Y_1$ and $Y_2$ have the same distribution.
(ii) $\mathrm{Var}\,Y_1 = m + m^2 + \phi m^p$.
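A short derivation of (ii) may help. It is only a sketch, assuming the usual Tweedie convention that $Tw_p(\mu, \sigma^2)$ has mean $\mu$ and variance $\sigma^2 \mu^p$ (this convention is not stated on the slide), and using $EX = \mathrm{Var}\,X = 1$ for $X \sim \mathrm{Exp}(1)$ together with the law of total variance:

```latex
\begin{align*}
E(Y \mid X) &= E(Z \mid X) = Xm, \\
\operatorname{Var}(Y \mid X) &= E(Z \mid X) + \operatorname{Var}(Z \mid X)
  = Xm + X^{1-p}\phi\,(Xm)^{p} = Xm + X\phi m^{p}, \\
\operatorname{Var}(Y) &= E\{\operatorname{Var}(Y \mid X)\} + \operatorname{Var}\{E(Y \mid X)\}
  = (m + \phi m^{p}) + m^{2}.
\end{align*}
```

The term $m^2$ comes entirely from the exponential mixing variable $X$, which is why the negative binomial-type variance $m + m^2$ appears as the natural reference.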

Table: Summary of PET models with support $S_p = \mathbb{N}$ of distributions.
Type(s) of PET = Geometric PT | p | Type(s) of Tweedie
Geometric Hermite | p = 0 | Gaussian
[Do not exist] | 0 < p < 1 | [Do not exist]
Geometric Neyman Type A | p = 1 | Poisson
Geometric Poisson compound Poisson | 1 < p < 2 | Gamma compound Poisson
Geometric Pólya-Aeppli | p = 3/2 | Non-central gamma
Geometric negative binomial | p = 2 | Gamma
Geometric Poisson positive stable | p > 2 | Positive stable
Geometric Poisson-inverse Gaussian | p = 3 | Inverse Gaussian

$Y \sim PETw_p(m, \phi)$ has pmf
$P(Y = y) = \int_0^\infty \int_0^\infty \frac{\exp\{-z - x\}\, z^{y}}{y!}\, Tw_p(mx, \phi x^{1-p})(z)\, dz\, dx$.
- No closed form is available: approximation by Monte Carlo integration and Tweedie simulations with rtweedie() in R (Dunn, 2013).
- Estimation and inference based on the likelihood function are difficult. Model selection: estimation of parameters by regression.
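The Monte Carlo idea can be sketched in a few lines. The slides rely on rtweedie() in R; the version below is a standard-library Python sketch restricted to the special case p = 2, where the Tweedie law is assumed to be a gamma law with the given mean and variance equal to dispersion times the squared mean. The function names, the default number of draws, and the closed-form check for P(Y = 0) are illustrative assumptions, not part of the original talk.

```python
import math
import random


def pet_pmf_mc_gamma(y, m, phi, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(Y = y) for the PET model with p = 2.

    Assumption (not from the slides): Tw_2(mean, dispersion) is a gamma
    law with that mean and variance dispersion * mean**2, so that
    Z | X = x is gamma with shape x / phi and scale phi * m.
    """
    rng = random.Random(seed)
    log_y_fact = math.lgamma(y + 1)
    total = 0.0
    for _ in range(n_draws):
        x = rng.expovariate(1.0)                      # X ~ Exp(1)
        # gammavariate needs a strictly positive shape
        z = rng.gammavariate(x / phi, phi * m) if x > 0.0 else 0.0
        if z > 0.0:
            # Poisson(z) probability of y, evaluated on the log scale
            total += math.exp(-z + y * math.log(z) - log_y_fact)
        elif y == 0:
            total += 1.0                              # Poisson(0) puts all mass at 0
    return total / n_draws


def pet_p0_exact_gamma(m, phi):
    """Closed form of P(Y = 0) for p = 2, used only as a sanity check."""
    return 1.0 / (1.0 + math.log(1.0 + phi * m) / phi)


estimate = pet_pmf_mc_gamma(0, m=1.0, phi=0.5, n_draws=50_000, seed=1)
exact = pet_p0_exact_gamma(m=1.0, phi=0.5)
print(f"Monte Carlo P(Y=0) = {estimate:.4f}, closed form = {exact:.4f}")
```

Averaging the conditional Poisson probability rather than simulating Y itself reduces the Monte Carlo noise. For other values of p the gamma draw would have to be replaced by a Tweedie sampler.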

3. PET regression models
Dispersion and zero-inflation indexes with respect to Poisson:
$\text{P-DI} = \dfrac{\mathrm{Var}\,Y}{EY}$ and [f] $\text{P-ZI} = EY + \log P(Y = 0)$.
Dispersion and zero-inflation indexes with respect to negative binomial:
$\text{NB-DI} = \dfrac{\mathrm{Var}\,Y}{EY + (EY)^2}$ and $\text{NB-ZI} = \log(1 + EY) + \log P(Y = 0)$.
The heavy tail index is independent of the reference model:
$\mathrm{HT} = \dfrac{P(Y = y + 1)}{P(Y = y)}$ for $y \to \infty$.
Proposition (Abid et al., 2018c)
PET is overdispersed and zero-inflated with respect to Poisson (P) and negative binomial (NB), respectively.
[f] Other definitions: $\text{P-DI} = (\mathrm{Var}\,Y - EY)/EY$ and $\text{P-ZI} = 1 + \log P(Y = 0)/EY$.
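As a small illustration, the four indexes can be computed from any probability mass function given as a list. The sketch below assumes the ratio form of the dispersion indexes and the additive form of the zero-inflation indexes; the helper names and the truncation point of the geometric example are hypothetical choices. The geometric law on {0, 1, ...} with mean m has variance m + m^2 and P(Y = 0) = 1/(1 + m), so it should come out NB-equidispersed with NB-ZI equal to zero.

```python
import math


def count_indexes(pmf):
    """Dispersion and zero-inflation indexes from a pmf given as a list.

    pmf[y] is P(Y = y) for y = 0, 1, ..., len(pmf) - 1. The list may be a
    truncated or empirical distribution; it is renormalised to sum to one.
    """
    total = sum(pmf)
    probs = [p / total for p in pmf]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    log_p0 = math.log(probs[0])
    return {
        "P-DI": var / mean,
        "NB-DI": var / (mean + mean ** 2),
        "P-ZI": mean + log_p0,
        "NB-ZI": math.log(1.0 + mean) + log_p0,
    }


def geometric_pmf(m, y_max=400):
    """pmf of the geometric law on {0, 1, ...} with mean m, truncated at y_max."""
    q = m / (1.0 + m)
    return [(1.0 - q) * q ** y for y in range(y_max + 1)]


idx = count_indexes(geometric_pmf(2.0))
for name, value in idx.items():
    print(f"{name}: {value:.4f}")
```

Relative to Poisson the same law is overdispersed (P-DI = 3 for m = 2) and zero-inflated, which is the sense in which the negative binomial reference "relativizes" both measures.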

Dispersion indexes of PET w.r. to Poisson and NB
Figure: Dispersion indexes of PET distribution as a function of m by dispersion and power parameters.

Zero-inflation indexes of PET w.r. to Poisson and NB
Figure: Zero-inflation indexes of PET distribution as a function of m by dispersion and power parameters.

Estimation and inference: Quasi-likelihood approach
Consider a cross-sectional data set $(y_i, x_i)$, $i = 1, \ldots, n$, where the $y_i$ are independent realizations of $Y_i \sim PETw_p(m_i, \phi)$, and $x_i$ and $\beta$ are $(Q \times 1)$ vectors of known covariates and unknown regression parameters, respectively.
$EY_i = m_i = \exp(x_i^\top \beta)$
$\mathrm{Var}\,Y_i = m_i + m_i^2 + \phi m_i^p = V_i$.
- Models with $-m^{2-p} - m^{1-p} < \phi < 0$ are permitted.
- In that case there may be no specific probability distribution.
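The lower bound on a negative dispersion parameter follows from requiring a positive variance. A one-line check, written here with the index $i$ made explicit (an assumption, since the slide states the bound without an index):

```latex
\begin{align*}
V_i = m_i + m_i^{2} + \phi\, m_i^{p} > 0
\;\Longleftrightarrow\;
\phi > -\frac{m_i + m_i^{2}}{m_i^{p}} = -m_i^{1-p} - m_i^{2-p}.
\end{align*}
```

For the bound to hold across a data set it would have to be satisfied for every fitted mean $m_i$.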

Dispersion indexes of PET w.r. to Poisson and NB for φ < 0
Figure: Dispersion indexes for PET distribution by negative dispersion and power parameters.

Dominant features of PET models
Table: Reference models and dominant features by dispersion and power parameter values with respect to the Poisson and negative binomial models.
Reference / PET | Dominant features | Dispersion | Power
Poisson/negative binomial | Equi/Equi | - | -
Geometric Hermite | Over, under | φ ≷ 0 | p = 0
Geometric Neyman Type A | Over, under, ZI | φ ≷ 0 | p = 1
Geometric Poisson compound Poisson | Over, under, ZI | φ ≷ 0 | 1 < p < 2
Geometric Pólya-Aeppli | Over, under, ZI | φ ≷ 0 | p = 1.5
Geometric negative binomial | Over, under | φ ≷ 0 | p = 2
Geometric Poisson positive stable | Over, HT | φ > 0 | p > 2
Geometric Poisson-inverse Gaussian | Over, HT | φ > 0 | p = 3

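Estimating function approach
The quasi-score function for β:
$\psi_\beta(\beta, \gamma) = \left( \sum_{i=1}^{n} \frac{\partial m_i}{\partial \beta_1} V_i^{-1} (y_i - m_i), \ldots, \sum_{i=1}^{n} \frac{\partial m_i}{\partial \beta_Q} V_i^{-1} (y_i - m_i) \right)^\top$.
The Pearson estimating function for the variance parameters γ = (φ, p):
$\psi_\gamma(\beta, \gamma) = \left( -\sum_{i=1}^{n} \frac{\partial V_i^{-1}}{\partial \phi} \{(y_i - m_i)^2 - V_i\},\ -\sum_{i=1}^{n} \frac{\partial V_i^{-1}}{\partial p} \{(y_i - m_i)^2 - V_i\} \right)^\top$.
The chaser algorithm (Jørgensen & Knudsen, 2004) is defined by
$\beta^{(i+1)} = \beta^{(i)} - S_\beta^{-1} \psi_\beta(\beta^{(i)}, \gamma^{(i)})$,
$\gamma^{(i+1)} = \gamma^{(i)} - \alpha S_\gamma^{-1} \psi_\gamma(\beta^{(i+1)}, \gamma^{(i)})$,
with
$S_{\beta_{jk}} = E\left( \frac{\partial}{\partial \beta_k} \psi_{\beta_j}(\beta, \gamma) \right)$ and $S_{\gamma_{jk}} = -\sum_{i=1}^{n} \frac{\partial V_i^{-1}}{\partial \gamma_j} V_i \frac{\partial V_i^{-1}}{\partial \gamma_k} V_i$.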
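To make the β-update concrete, here is a minimal sketch of the Fisher-scoring step on the quasi-score for a model with an intercept and a single covariate, with φ and p held fixed. It is not the full chaser algorithm: the γ-update is omitted, and the function names, the starting value, and the hand-written 2x2 solve are illustrative assumptions. With the log link, the derivatives of m_i are m_i and m_i x_i, so the negative of S_β has entries equal to sums of m_i^2 / V_i times products of covariates.

```python
import math


def pet_variance(m, phi, p):
    """Variance function V = m + m^2 + phi * m^p of the PET regression model."""
    return m + m * m + phi * m ** p


def fit_beta(x, y, phi, p, n_iter=50, tol=1e-10):
    """Fisher-scoring iterations on the quasi-score for beta = (b0, b1).

    Model: m_i = exp(b0 + b1 * x_i), V_i = pet_variance(m_i, phi, p).
    phi and p are held fixed (the gamma-update is not implemented here).
    """
    b0 = math.log(sum(y) / len(y))    # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        s0 = s1 = 0.0                 # quasi-score components
        a00 = a01 = a11 = 0.0         # entries of -S_beta
        for xi, yi in zip(x, y):
            m = math.exp(b0 + b1 * xi)
            v = pet_variance(m, phi, p)
            r = m * (yi - m) / v      # (dm/db0) * V^{-1} * (y - m)
            w = m * m / v
            s0 += r
            s1 += r * xi
            a00 += w
            a01 += w * xi
            a11 += w * xi * xi
        det = a00 * a11 - a01 * a01
        d0 = (a11 * s0 - a01 * s1) / det   # solve the 2x2 linear system
        d1 = (a00 * s1 - a01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1          # beta - S_beta^{-1} psi_beta
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Noise-free check: with y_i equal to its mean, the quasi-score vanishes
# at the generating coefficients, so the iterations should recover them.
xs = [-1.0 + 2.0 * i / 49 for i in range(50)]
ys = [math.exp(1.0 + 0.5 * xi) for xi in xs]
b0_hat, b1_hat = fit_beta(xs, ys, phi=1.0, p=1.5)
print(b0_hat, b1_hat)
```

In the chaser algorithm this step would alternate with the Pearson update of γ = (φ, p), damped by the tuning constant α.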

4.1 Simulation studies
The expectation and the variance of the PET random variable are given by
$m_i = \exp(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i})$ and $V_i = m_i + m_i^2 + \phi m_i^p$,
where $x_1$ and $x_2$ are sequences from −1 to 1 with length equal to the sample size.
The regression coefficients were fixed at the values β_0 = 1, β_1 = 1 and β_2 = 0.9.
We use different sample sizes (n = 500, 1000 and 5000), generating 1000 data sets in each case.
We considered three values of the Tweedie power parameter, p = 1.5, 2, 3, combined with three values of the dispersion parameter, φ = 0.5, 1, 1.5.

Average bias for parameters
Figure: Average bias for each parameter by sample size and simulation scenarios.

Confidence intervals for parameters
Figure: Confidence intervals for each parameter by simulation scenarios.

4.2 Three Applications
4.2.1. Accidents of private cars in Switzerland (Klugman et al., 2004): Data analysed in Aryuyuen and Bodhisuwan (2013) using the NB-GE distribution.
P-DI = 1.154; NB-DI = 0.999: P-overdispersed; NB-equidispersed.
P-ZI = 0.154; NB-ZI = 2.709: P-zero-inflated; NB-zero-deflated.
Table: Parameter estimates for different models. Fitted frequencies marked (k+) are pooled over k or more accidents.
Number of accidents | Observed frequencies | Poisson | NB | NB-GE | PT | PET
0 | 103704 | 102629.6 | 103723.6 | 103708.8 | 103708.1 | 103708.8
1 | 14075 | 15922.0 | 13989.9 | 14046.8 | 14041.3 | 14044.9
2 | 1766 | 1235.1 | 1857.1 | 1797.8 | 1809.6 | 1797.3
3 | 255 | 66.3 (3+) | 245.2 | 251.7 | 250.3 | 259.1
4 | 45 | | 37.2 (4+) | 36.0 | 36.9 | 36.6
5 | 6 | | | 12 (5+) | 6.8 (5+) | 6.3 (5+)
6 | 2 | | | | |
7+ | 0 | | | | |
Parameter estimates | | λ = 0.155 | r = 1.032, q = 0.150 | r = 2.431, α = 3.289, β = 31.278 | p = 2.600, φ = 3.000, m = 0.155 | p = 1.950, φ = 0.050, m = 0.155
Chi-squares | 2 | 1332.300 | 12.120 | 4.260 | 3.209 | 2.932
p-value | | < 0.0001 | 0.0023 | 0.1180 | 0.2010 | 0.2300
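The reported p-values can be checked against the chi-square statistics with a few lines of arithmetic. The sketch below assumes a chi-square reference distribution with 2 degrees of freedom for every model; this is an inference from the reported numbers rather than something stated on the slide. For 2 degrees of freedom the survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi2_sf_df2(x):
    """Survival function of the chi-square law with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Chi-square statistics as reported in the table of fitted models.
reported = {
    "Poisson": 1332.300,
    "NB": 12.120,
    "NB-GE": 4.260,
    "PT": 3.209,
    "PET": 2.932,
}
p_values = {name: chi2_sf_df2(stat) for name, stat in reported.items()}
for name, pv in p_values.items():
    print(f"{name:8s} p-value = {pv:.4f}")
```

Under this assumption the recomputed values agree with the table to about three decimal places, with PET giving the largest p-value among the five fits.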

4.2.2. Accident occurrence in car insurance on Tunisian data: Data from an insurer operating in the automobile insurance market in Tunisia.
P-DI = 9.100; NB-DI = 6.204: NB-overdispersed.
P-ZI = 0.268; NB-ZI = 0.185: NB-zero-inflated.
Table: Parameter estimates and standard errors (SE) for PET and PT models; paic for models.
Parameter | PET | PT
Intercept | 0.267 (0.111) | 0.267 (0.111)
Car age | 0.105 (0.003) | 0.105 (0.003)
Car power | 0.112 (0.009) | 0.112 (0.009)
Driver age | 0.102 (0.002) | 0.102 (0.002)
φ | 0.529 (0.093) | 1.741 (0.096)
p | 2.840 (0.149) | 2.329 (0.075)
paic | 15287.230 | 15298.930

4.2.3. Buildings maintenance data (Yeoeman, 1987): Data on the number of occurrences of repairs for 2 549 buildings. The number of maintenance actions for all buildings is available for four years: 1982, 1983, 1984, 1985. For a given year, the total number of building maintenance actions $Y_i$ in the i-th time frame follows the PET model $PETw_p(m_i, \phi)$, $i = 1, \ldots, 46$.
Table: Estimated dispersion and zero-inflation indexes of datasets.
Dataset | P-DI | NB-DI | P-ZI | NB-ZI
No 1 (1982) | 328.692 | 5.224 | 60.550 | 2.789
No 2 (1983) | 243.826 | 7.393 | 30.539 | 2.232
No 3 (1984) | 619.217 | 4.253 | 142.707 | 3.232
No 4 (1985) | 910.662 | 4.405 | 203.660 | 3.295

Table: Parameter estimates and standard errors (SE) for PET and PT (PT shown in italics on the slide, listed here as the second row of each parameter); paic for models.
Parameter | Model | No 1 (1982) | No 2 (1983) | No 3 (1984) | No 4 (1985)
Intercept | PET | 33.904 (0.299) | 0.039 (0.048) | 147.718 (0.298) | 168.397 (0.300)
Intercept | PT | 33.904 (0.299) | -1.130 (0.477) | 147.718 (0.298) | 168.397 (0.300)
Age | PET | 3.833 (0.010) | 6.512 (0.058) | 9.417 (0.111) | 7.795 (0.111)
Age | PT | 3.833 (0.010) | 7.100 (0.144) | -9.417 (0.110) | 7.795 (0.111)
φ | PET | 3.5 × 10^-5 (0.000) | 1.4 × 10^-5 (0.001) | 0.003 (0.001) | 2.001 (0.000)
φ | PT | 1.000 (0.312) | 4.358 (1.336) | 1.010 (0.372) | 1.900 (0.131)
p | PET | 1.420 (0.000) | 1.351 (0.035) | 1.501 (0.004) | 1.701 (0.000)
p | PT | 2.010 (0.007) | 1.996 (0.014) | 2.002 (0.002) | 2.100 (0.000)
paic 3307.201 542.176 3310.951 545.337 9208.462 1749.206

5. Conclusion & Perspectives
1. Model selection in the PET class to deal with ultra-overdispersed count data.
2. Negative binomial dispersion and zero-inflation indexes relativize the ultra-overdispersion and the excess of zeros.
3. Multivariate version of PET sums:
$S(Q; PT) = \left( \sum_{l=1}^{G_1(Q)} PT_{l1}, \ldots, \sum_{l=1}^{G_k(Q)} PT_{lk} \right)^\top$,
where $Q = \{q_{ij}\}_{i,j=1}^{k}$ is a suitable matrix of parameters. Independent or correlated components $(G_j(Q))_{j=1}^{k}$?
4. Let $Y_1, \ldots, Y_n$ be an n-variate response vector on $\mathbb{N}^d$, $d \geq 1$, with
$EY_i = m_i = G^{-1}(X_i \beta)$,
$\mathrm{cov}(Y_i, Y_j) = \Sigma_i^{1/2} (\rho_{ij} I_d) \Sigma_j^{1/2}$,
$\Sigma_i = \mathrm{diag}_d(m_i) + \mathrm{diag}_d(m_i^2) + \mathrm{diag}_d(m_i^p)^{1/2}\, \mathrm{diag}_d(\phi)\, \mathrm{diag}_d(m_i^p)^{1/2}$.

Further references
1. Abid, R., Kokonendji, C.C. and Masmoudi, A. (2018a). Geometric dispersion models with quadratic v-functions. Submission Needing Revision.
2. Abid, R., Kokonendji, C.C. and Masmoudi, A. (2018b). Geometric Tweedie regression models for continuous and semicontinuous data with variation phenomenon. Submitted for publication.
3. Abid, R., Kokonendji, C.C. and Masmoudi, A. (2018c). On Poisson-exponential-Tweedie regression models for ultra-overdispersed count data. To submit asap.
4. Dunn, P.K. (2013). Tweedie exponential family models. Version 2.1.7. R package URL http://cran.r-project.org/web/packages/tweedie/tweedie.
5. Jørgensen, B. and Knudsen, S.J. (2004). Parameter orthogonality and bias adjustment for estimating functions. Scandinavian Journal of Statistics 31:93-114.
6. Klugman, S.A., Panjer, H.H. and Willmot, G.E. (2004). Loss Models: From Data to Decisions, 2nd edn. Wiley, Hoboken, NJ.
7. Tweedie, M.C.K. (1984). An index which distinguishes between some important exponential families. In Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference (J. K. Ghosh and J. Roy, eds.), pp. 579-604, Indian Statistical Institute, Calcutta.

Thank You

Geometric dispersion models
(0) Let $\mu$ be a probability measure.
(1) The geometric cumulant function of $\mu$ (Jørgensen & Kokonendji, 2011) is
$C_\mu(\theta) = 1 - \dfrac{1}{L_\mu(\theta)}$ on $\Theta(\mu) = \{\theta \in \mathbb{R};\ 0 < L_\mu(\theta) < \infty\}$.
(2) In general, the map $\theta \mapsto C'_\mu(\theta)$ is not strictly monotone on $\Theta(\mu)$.
- Let $\widetilde{\Theta}(\mu) \subseteq \Theta(\mu)$ be an interval on which $C'_\mu$ is strictly monotone.
- The map $\theta \mapsto C'_\mu(\theta)$ is a diffeomorphism between $\widetilde{\Theta}(\mu)$ and $C'_\mu(\widetilde{\Theta}(\mu)) =: \Phi_\mu$. Denote $\varphi_\mu := (C'_\mu)^{-1}$.
(3) v-function: $m \mapsto V_\mu(m) = C''_\mu(\varphi_\mu(m))$ on $\Phi_\mu$.
- Denote $\Phi^{-}_\mu := C'_\mu(\Theta^{-}(\mu))$ and $\Phi^{+}_\mu := C'_\mu(\Theta^{+}(\mu))$. Then $V_\mu(m) < 0$ on $\Phi^{-}_\mu$ and $V_\mu(m) > 0$ on $\Phi^{+}_\mu$.
Note: $V_\mu = \mathrm{Var} - E^2$.

How to derive GDMs from EDMs with $\Theta^{\pm}(\mu) = \Theta(\mu)$?
Let $\mu$ be a probability measure.
(1) If there exists a probability measure $\nu$ such that $C_\mu(\theta) = K_\nu(\theta)$, then $\Theta^{+}(\mu) = \{\theta \in \Theta(\mu);\ C''_\mu(\theta) = K''_\nu(\theta) > 0\} = \Theta(\mu)$. $\nu(\mu)$: see the Proposition below.
(2) If there exists a probability measure $\nu$ such that $C_\mu(\theta) = -K_\nu(\theta)$, then $\Theta^{-}(\mu) = \{\theta \in \Theta(\mu);\ C''_\mu(\theta) = -K''_\nu(\theta) < 0\} = \Theta(\mu)$. $\nu(\mu)$?!
Proposition (Exponential mixture distributions)
Let $\mu$ be a probability measure and $\nu$ an infinitely divisible σ-finite positive measure. Consider $F(\nu) = \{P(\theta, \nu);\ \theta \in \Theta(\nu)\}$. The following assertions are equivalent:
(i) For all $m \in \Phi^{+}_\mu$, $V_\mu(m) = V_{F(\nu)}(m)$.
(ii) The measure $\mu$ is an exponential mixture measure
$\mu(dy) = \int_0^\infty e^{-x}\, P(\gamma, \nu_x)(dy)\, dx$, (3)
with $\gamma \in \Theta(\nu)$, where $\nu_x$ denotes the x-th convolution power of $\nu$, that is, $L_{\nu_x}(\theta) = (L_\nu(\theta))^{x}$.
Note: Under the assumption of infinite divisibility, the corresponding exponential mixture has a v-function identical to the variance function $V_{F(\nu)}$.