Nonparametric estimation of the number of zeros in truncated count distributions

Célestin C. KOKONENDJI
University of Franche-Comté, France
Laboratoire de Mathématiques de Besançon - UMR 6623 CNRS-UFC
celestin.kokonendji@univ-fcomte.fr

Seminar of IRP on Statistical Advances for Complex Data, CRM, Bellaterra
Joint work with Pere Puig, UAB

Acknowledgements

- Centre de Recerca Matemàtica (CRM): Intensive Research Program (IRP) on Statistical Advances for Complex Data
- Universitat Autònoma de Barcelona (UAB): Departament de Matemàtiques & Servei d'Estadística Aplicada
- Pere PUIG:
  - Invitation to the IRP on Statistical Advances for Complex Data ("Multivariate over-, equi- and underdispersion", in progress)
  - DoReMi Workshop & Seminari del DEIO (UPC) with Marta Pérez-Casany (also for the Barcelona and Sitges visits)
  - Many excursions (e.g. Costa Brava), Castanyada, etc.

Moltes Gràcies (many thanks)

Outline

Title: Nonparametric estimation of the number of zeros in truncated count distributions

1. Introduction
2. Count distributions with log-convex pgf
3. Fascination with lower bounds of $p_0$
4. Estimating the non-observed number of zeros
5. Applications

Introduction

In many practical situations the researcher is not able to observe the entire distribution of counts in an experiment. In particular, the zeros are often not observed, leading to so-called zero-truncated count data.

For instance, capture-recapture models, used in Biology and Ecology, are a methodology commonly used to estimate the size of an animal population.

In many cases the estimation of the unobserved number of zeros is an important issue:

- Cholera data set of McKendrick
- Number of words known but unused by Shakespeare
- Number of grizzly bear females in Yellowstone

Cholera data set of McKendrick [i]

Probably the oldest example of estimating the number of zeros is that of McKendrick (1926), who analyzed the number of individuals with cholera in 223 households in a village in India. The table gives the number of households (frequency) for each number of infections; 168 households had no infections.

- McKendrick argued that a household with no cases of cholera could arise either because its members had not been exposed, or because they had been exposed but had not been infected.
- McKendrick wanted to estimate the number of individuals who were exposed but did not develop the symptoms. To do this, he ignored the 168 households with zero cases and developed an estimator of the number of zeros from the other observations, based on the zero-truncated Poisson distribution.

i. McKendrick, A. (1926). Applications of mathematics to medical problems. Proc. Edinb. Math. Soc. 44.

Number of words known but unused by Shakespeare [ii]

Another interesting example arises from the following question: how many words did Shakespeare know?

The information to be taken into account is that Shakespeare wrote [...] different words, of which [...] were used exactly once, 4343 were used exactly twice, 2292 were used exactly three times, and so forth. Efron and Thisted report the full table of occurrences against number of words (frequency).

In this problem the frequency of zeros to be estimated represents the number of words that Shakespeare knew but did not use in any of his known works.

ii. Efron, B., Thisted, R. (1976). Estimating the number of unseen species: How many words did Shakespeare know? Biometrika 63.

Number of grizzly bear females in Yellowstone [iii]

Most practical examples of estimating the number of zeros come from the capture-recapture sampling scheme. Keating et al. (2002) studied the annual numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population, from 1986 to 2001. The table (sights against number of bears) gives the number of unique females with cubs-of-the-year that were seen exactly $j$ times during 1998.

Each sighting is considered a "capture", so that 11 females were captured exactly once, 13 were captured twice, and so forth. In this case, the number of bears observed is just 33. The frequency of zeros $f_0$ represents the number of bears not observed, so that the total number of grizzly bear females would be $33 + f_0$.

iii. Keating, K., Schwartz, C., Haroldson, M., Moody, D. (2002). Estimating numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population. Ursus 13.

Count distributions with log-convex pgf

- A very wide class of count distributions, including the Compound and Mixed Poisson families
- Examples with differences ("Desigual")
- Overdispersion (relative to Poisson)
- Zero-inflation (relative to Poisson)

Subsections: Discrete Compound Poisson distributions; Mixed Poisson distributions; Log-convexity class.

Siméon Denis Poisson

Discrete Compound Poisson distributions

A r.v. $X$ follows a discrete Compound-Poisson (dcp) distribution if
$$X = \sum_{i=1}^{N} Y_i, \quad \text{with pgf} \quad \Phi_X(t) := E\,t^X = \sum_{k=0}^{\infty} t^k P(X = k) = \exp\{-\lambda[1 - \Psi(t)]\},$$
where $N \sim \text{Poisson}(\lambda)$ and $Y_1, Y_2, \ldots$ are iid count r.v.'s, independent of $N$, with pgf $\Psi(\cdot)$.

The dcp distributions constitute a huge family of count distributions, according to:

- Feller's characterization: the dcp distributions are the only discrete distributions that are infinitely divisible.

See, e.g., Johnson et al. (2005) and Steutel and van Harn (2004) for properties, formulae and algorithms to calculate the probabilities.

Examples of dcp distributions: Hermite, negative binomial, strict arcsine, Poisson-Tweedie, Hinde-Demétrio [a].

a. Kokonendji, C.C., Dossou-Gbété, S., Demétrio, C.G.B. (2004). Some discrete exponential dispersion models: Poisson-Tweedie and Hinde-Demétrio classes. SORT 28.

Mixed Poisson distributions [iv]

A r.v. $X$ follows a Mixed-Poisson (MP) distribution on $\mathbb{N} := \{0, 1, \ldots\}$ if
$$p_k := P(X = k) = \int_0^{\infty} e^{-\lambda} \frac{\lambda^k}{k!}\, dF(\lambda), \quad \text{with} \quad \Phi_X(t) = \int_0^{\infty} e^{-\lambda(1-t)}\, dF(\lambda),$$
where $F$ is a distribution function on $[0, \infty)$.

Examples of mixing distributions $F$ (and the resulting MP distributions):

- Poisson (Neyman A),
- gamma (negative binomial),
- inverse-Gaussian (Sichel or PIG),
- Tweedie positive stable (Poisson-Tweedie),
- $F$ with finite support.

Remark:

- All Poisson-Tweedie (PTw) distributions belong to $\text{MP} \cap \text{dcp}$; $\text{PTw} \cap \text{HD} = \{\text{NB}\}$.
- MP distributions with $F$ of finite support are not dcp.
- The dcp distributions Hermite, strict arcsine and $\text{HD} \setminus \text{NB}$ are not MP.

iv. Grandell, J. (1997). Mixed Poisson Processes. Chapman & Hall, London.
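As a numerical illustration of the mixing integral, the sketch below checks that a gamma mixing distribution reproduces the negative binomial probabilities. It is only a sketch: the function names, the midpoint-rule integration, the truncation of the integral at 80 and the parameter values $\mu = 3$, $\phi = 2$ are all assumptions made for this illustration, not part of the talk.

```python
import numpy as np
from math import exp, lgamma, log

def negbin_pmf(k, mu, phi):
    """Negative binomial probability with mean mu and shape phi."""
    return exp(lgamma(phi + k) - lgamma(phi) - lgamma(k + 1)
               + phi * log(phi / (phi + mu)) + k * log(mu / (phi + mu)))

def mixed_poisson_pmf(k, mixing_density, upper=80.0, n_grid=400_000):
    """Midpoint-rule approximation of the integral of e^{-lam} lam^k / k! dF(lam).

    The integral is truncated at `upper` (an assumption that the mixing
    density has a negligible tail beyond that point).
    """
    h = upper / n_grid
    lam = (np.arange(n_grid) + 0.5) * h
    log_kernel = -lam + k * np.log(lam) - lgamma(k + 1)
    return float(np.sum(np.exp(log_kernel) * mixing_density(lam)) * h)

def gamma_density(mu, phi):
    """Gamma density with mean mu and shape phi (rate phi / mu)."""
    rate = phi / mu
    return lambda lam: np.exp(phi * np.log(rate) + (phi - 1) * np.log(lam)
                              - rate * lam - lgamma(phi))

mu, phi = 3.0, 2.0  # illustrative values only
max_err = max(abs(mixed_poisson_pmf(k, gamma_density(mu, phi)) - negbin_pmf(k, mu, phi))
              for k in range(8))
```

Here `max_err` is the largest discrepancy over $k = 0, \ldots, 7$ between the numerically mixed Poisson and the closed-form negative binomial; it should be at the level of the integration error.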

Class of log-convex pgf

Proposition. Let $X$ be a discrete r.v., Compound- or Mixed-Poisson distributed, with pgf $\Phi_X(\cdot)$. Then $\log \Phi_X(\cdot)$ is a convex function on $[0, 1]$.

Proof: Easy for dcp, since $\log \Phi_X(t) = -\lambda[1 - \Psi(t)]$ and $\Psi$ is convex on $[0, 1]$. For MP we need $\Phi_X'' \Phi_X \ge (\Phi_X')^2$. Let $dG_t(\lambda) = e^{-\lambda(1-t)}\, dF(\lambda)$; then
$$\int \lambda^2\, dG_t(\lambda) \int dG_t(\lambda) \ge \left( \int \lambda\, dG_t(\lambda) \right)^2 \quad \text{(Cauchy-Schwarz)}.$$

Properties: Log-convexity $\Rightarrow$ Overdispersion ($\mathrm{Var}\,X \ge EX$) and Zero-inflation ($p_0 \ge e^{-EX}$).

The class of count distributions with log-convex pgf is wider than $\text{dcp} \cup \text{MP}$.

Example & "Desigual": $\Phi_X(t) = 1/5 + t/5 + t^2/5 + t^3/20 + 7t^4/20$ is a log-convex function on $[0, 1]$, but $X$ is not in $\text{dcp} \cup \text{MP}$, by direct calculations.
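The "direct calculations" for this example can be sketched numerically. The code below evaluates the log-convexity margin $\Phi_X \Phi_X'' - (\Phi_X')^2$ on a grid of $[0, 1]$ and checks that the Compound/Mixed-Poisson inequality $\binom{3}{2} p_3 p_0 \ge p_1 p_2$ fails. The coefficients $1/20$ and $7/20$ are read so that the probabilities sum to one; the grid size and function names are assumptions of this sketch, and a grid check is an illustration, not a proof.

```python
import numpy as np

# Probabilities p_0..p_4, i.e. the coefficients of t^k in the example pgf.
p = np.array([1 / 5, 1 / 5, 1 / 5, 1 / 20, 7 / 20])
k = np.arange(len(p))

def pgf_derivatives(t):
    """Return (Phi, Phi', Phi'') of the example pgf at t."""
    phi = np.sum(p * t ** k)
    d1 = np.sum(k[1:] * p[1:] * t ** (k[1:] - 1))
    d2 = np.sum(k[2:] * (k[2:] - 1) * p[2:] * t ** (k[2:] - 2))
    return phi, d1, d2

def logconvex_margin(t):
    """Phi * Phi'' - (Phi')^2, which is >= 0 exactly where log(Phi) is convex."""
    phi, d1, d2 = pgf_derivatives(t)
    return phi * d2 - d1 ** 2

grid = np.linspace(0.0, 1.0, 1001)
min_margin = min(logconvex_margin(t) for t in grid)

# Inequality for dcp/MP with k = 1, r = 2: 3 * p_3 * p_0 >= p_1 * p_2.
violates_dcp_mp_inequality = 3 * p[3] * p[0] < p[1] * p[2]
```

A positive `min_margin` together with `violates_dcp_mp_inequality` being true is consistent with the claim that the pgf is log-convex while the distribution is neither Compound- nor Mixed-Poisson.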

Fascination with lower bounds of $p_0$: from "Desigual" to $0 = 1 + e^{i\pi}$

Subsections: Some lower bounds of $p_0$; An improved inequality.

Some lower bounds of $p_0$: Part I ($\text{dcp} \cup \text{MP}$)

Proposition (I). Let $X$ be a discrete r.v., Compound- or Mixed-Poisson distributed. Then
$$\binom{k+r}{r}\, p_{k+r}\, p_0 \ge p_k\, p_r, \quad k, r \ge 1, \qquad (1)$$
where $p_k = P(X = k)$, $k \in \{0, 1, 2, \ldots\}$.

Set of lower bounds of $p_0$: (1) implies
$$p_0 \ge \frac{p_k\, p_r}{\binom{k+r}{r}\, p_{k+r}}, \quad k, r \ge 1. \qquad (2)$$

Remark:

(i) The equalities in (1) or (2) are satisfied iff $X$ is Poisson distributed.

(ii) Taking $k = r = 1$ gives the well-known lower bound of Chao (1987) (see Böhning, 2010):
$$p_0 \ge \frac{p_1^2}{2 p_2}. \qquad (3)$$
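Inequality (1) can be illustrated numerically. The sketch below evaluates the gap between the two sides of (1) for a negative binomial (a Mixed Poisson) and for a Poisson distribution, where the gap should vanish. The parameter values and the range of $(k, r)$ pairs are arbitrary choices for this illustration.

```python
from math import comb, exp, factorial, lgamma, log

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

def negbin_pmf(k, mu, phi):
    """Negative binomial with mean mu and shape phi (a gamma-mixed Poisson)."""
    return exp(lgamma(phi + k) - lgamma(phi) - lgamma(k + 1)
               + phi * log(phi / (phi + mu)) + k * log(mu / (phi + mu)))

def gap(pmf, k, r):
    """Left side minus right side of inequality (1)."""
    return comb(k + r, r) * pmf(k + r) * pmf(0) - pmf(k) * pmf(r)

pairs = [(k, r) for k in range(1, 6) for r in range(1, 6)]
nb_gaps = [gap(lambda j: negbin_pmf(j, 2.0, 1.5), k, r) for k, r in pairs]
po_gaps = [gap(lambda j: poisson_pmf(j, 2.0), k, r) for k, r in pairs]
```

All entries of `nb_gaps` should be strictly positive, while those of `po_gaps` should be zero up to rounding, in line with Remark (i).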

Some lower bounds of $p_0$: Part II (Log-convexity)

In general, Log-convexity does not imply the inequalities (1) or (2); cf. the preceding Example & "Desigual", where $3 p_3 p_0 < p_1 p_2$.

Besides, Log-convexity allows other $p_0$-inequalities, involving also the population mean and again Chao's lower bound:

Proposition (II). Let $X$ be a discrete r.v. with a log-convex pgf $\Phi_X(\cdot)$ on $[0, 1]$, such that $E(X) = \mu$. Then,

i. $p_0 \ge \exp(-\mu)$: (Poisson) zero-inflation
ii. $p_0 \ge p_1/\mu \iff \mu \ge p_1/p_0$: Turing's estimator (Good, 1953)
iii. $p_0 \ge p_1^2/(2 p_2)$: Chao's lower bound.

Note:

- Equalities in (i)-(iii) are satisfied for the Poisson distribution.
- The inequalities (i)-(iii) are well known either for both or for one of the Compound- and Mixed-Poisson distributions.
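As a worked check, the three inequalities can be evaluated for the example pgf $\Phi_X(t) = 1/5 + t/5 + t^2/5 + t^3/20 + 7t^4/20$ from the log-convexity slide (the rounding to three decimals is ours):

```latex
\begin{align*}
\mu &= \Phi_X'(1) = \tfrac{1}{5} + 2\cdot\tfrac{1}{5} + 3\cdot\tfrac{1}{20} + 4\cdot\tfrac{7}{20} = 2.15,
  \qquad p_0 = \tfrac{1}{5} = 0.2, \\
\text{(i)}\quad & e^{-\mu} = e^{-2.15} \approx 0.116 \le 0.2, \\
\text{(ii)}\quad & p_1/\mu = 0.2/2.15 \approx 0.093 \le 0.2, \\
\text{(iii)}\quad & p_1^2/(2p_2) = 0.04/0.4 = 0.1 \le 0.2 .
\end{align*}
```

All three bounds hold, while the pair $k = 1$, $r = 2$ in (2) would give $p_1 p_2/(3 p_3) = 0.04/0.15 \approx 0.267 > p_0$, which is why (2) cannot be used outside the Compound- and Mixed-Poisson families.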

An improved inequality: Part III

Let $X$ be a r.v., Compound- or Mixed-Poisson distributed, with $E(X) = \mu$. Because all the inequalities in (2) and in Prop. (II) are satisfied, a sharper lower bound of $p_0$ can be obtained by taking the maximum of all of them. Concretely, with
$$p_M := \max_{r,k} \frac{p_k\, p_r}{\binom{k+r}{r}\, p_{k+r}},$$
we have
$$p_0 \ge \max\{ p_M,\ \exp(-\mu),\ p_1/\mu \}. \qquad (4)$$

Lemma (Lanumteang & Böhning (2011), in the proof of their Th. 1). Let $X$ be a discrete r.v., Mixed-Poisson distributed. Then
$$\frac{p_1}{p_0} \le \frac{2 p_2}{p_1} \le \frac{3 p_3}{p_2} \le \cdots \le \frac{k\, p_k}{p_{k-1}} \le \cdots$$

Proposition (III). Under Mixed Poisson:
$$p_M := \max_{r,k} \frac{p_k\, p_r}{\binom{k+r}{r}\, p_{k+r}} = \frac{p_1^2}{2 p_2}.$$

Example 1: Negative binomial $= (\text{HD} \cap \text{PTw}) \subset \text{MP}$

$$p_k = \left(\frac{\phi}{\phi+\mu}\right)^{\phi} \left(\frac{\mu}{\phi+\mu}\right)^{k} \frac{\Gamma(\phi+k)}{k!\,\Gamma(\phi)}, \quad k = 0, 1, 2, \ldots,$$
with mean $\mu$ and shape parameter $\phi > 0$. Direct calculations show that
$$\frac{p_k\, p_r}{\binom{k+r}{r}\, p_{k+r}} = \left(\frac{\phi}{\phi+\mu}\right)^{\phi} \frac{\Gamma(\phi+k)\,\Gamma(\phi+r)}{\Gamma(\phi+k+r)\,\Gamma(\phi)}, \quad k, r = 1, 2, \ldots,$$
and its maximum is attained for $k = r = 1$, i.e., at Chao's lower bound (3). This agrees with Prop. (III), because NB is a Mixed Poisson. Consequently,
$$p_M = \left(\frac{\phi}{\phi+\mu}\right)^{\phi} \frac{\phi}{\phi+1}.$$
Here $p_1/\mu = (\phi/(\phi+\mu))^{\phi+1}$, so $p_M \ge p_1/\mu$ for $\mu \ge 1$, while $\exp(-\mu) \ge p_1/\mu$ for $\mu \le 1$. Direct calculations then show that inequality (4) reduces to
$$p_0 \ge \max\left\{ \left(\frac{\phi}{\phi+\mu}\right)^{\phi} \frac{\phi}{\phi+1},\ \exp(-\mu) \right\}. \qquad (5)$$
The maximum on the right side of (5) is attained at $\exp(-\mu)$ for $0 < \mu \le \mu_0$, and at $p_M$ for $\mu \ge \mu_0$, where $\mu_0$ is the solution of the equation $\exp(-\mu) = p_M$.
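The crossing point $\mu_0$ has no closed form, but it can be located numerically. The sketch below uses plain bisection on $\log \exp(-\mu) - \log p_M$, which is decreasing in $\mu$; the function names, the bracketing strategy and the value $\phi = 2$ are assumptions of this illustration.

```python
from math import exp, log

def p_M(mu, phi):
    """Chao-type lower bound for the negative binomial (k = r = 1)."""
    return (phi / (phi + mu)) ** phi * phi / (phi + 1.0)

def p0(mu, phi):
    """Zero probability of the negative binomial with mean mu and shape phi."""
    return (phi / (phi + mu)) ** phi

def mu_threshold(phi, n_iter=200):
    """Solve exp(-mu) = p_M(mu, phi) for mu by bisection."""
    h = lambda mu: -mu - log(p_M(mu, phi))   # positive while exp(-mu) > p_M
    lo, hi = 0.0, 1.0
    while h(hi) > 0.0:                       # expand until the root is bracketed
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

phi = 2.0                 # illustrative shape parameter
mu0 = mu_threshold(phi)   # exp(-mu) dominates below mu0, p_M above it
```

For any $\mu$, `p0(mu, phi)` should be at least `max(p_M(mu, phi), exp(-mu))`, which is the content of (5).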

Example 2: Hermite of 3rd order [v] ($\text{dcp} \setminus \text{MP}$)

Consider $X$ a Compound-Poisson count r.v. whose compounding distribution takes a finite range of values, 0, 1, 2 and 3. It leads to a third-order Hermite distribution, which can be represented as
$$X = X_1 + 2X_2 + 3X_3, \quad \text{with independent } X_i \sim \mathcal{P}(\lambda_i).$$
Its probabilities, $p_k = P(X = k)$, can be calculated using the recursive relation
$$p_k = (p_{k-1}\lambda_1 + 2 p_{k-2}\lambda_2 + 3 p_{k-3}\lambda_3)/k,$$
where $p_0 = \exp(-\lambda_1 - \lambda_2 - \lambda_3)$ and $p_{-1} = p_{-2} = 0$.

This distribution is in $\text{dcp} \setminus \text{MP}$, and consequently the value of $p_M$ in (4) is not always Chao's lower bound. Indeed, taking $\lambda_2 = .5$ and $\lambda_3 = 1$, numerical calculations show that:

- for $\lambda_1 = 1.5$ the maximum is at Chao's lower bound, i.e. $p_M = p_1^2/(2p_2)$,
- for $\lambda_1 = 2$ the maximum is $p_M = p_1 p_2/(3p_3)$, and
- for $\lambda_1 = 3$ the maximum is $p_M = p_2^2/(6p_4)$.

v. Puig, P., Barquinero, J.F. (2011). An application of compound Poisson modelling to biological dosimetry. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 467 (2127).
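The recursion for the third-order Hermite probabilities is short enough to sketch directly. The function name and the parameter values below are illustrative assumptions (they are not the values used on the slide), and the checks only confirm that the recursion yields a proper distribution with mean $\lambda_1 + 2\lambda_2 + 3\lambda_3$.

```python
from math import exp

def hermite3_pmf(lam1, lam2, lam3, kmax):
    """p_0..p_kmax of X = X1 + 2*X2 + 3*X3 with independent X_i ~ Poisson(lam_i)."""
    lam = (lam1, lam2, lam3)
    p = [exp(-(lam1 + lam2 + lam3))]          # p_0
    for k in range(1, kmax + 1):
        total = 0.0
        for j in (1, 2, 3):
            if k - j >= 0:                     # p_{-1} = p_{-2} = 0
                total += j * lam[j - 1] * p[k - j]
        p.append(total / k)
    return p

# Illustrative parameters only.
probs = hermite3_pmf(1.0, 0.3, 0.2, kmax=120)
total_mass = sum(probs)
mean = sum(k * pk for k, pk in enumerate(probs))
```

With these values `total_mass` should be 1 and `mean` should be $1.0 + 2(0.3) + 3(0.2) = 2.2$, up to rounding and a negligible truncated tail.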

Estimating the non-observed number of zeros: from Chao to $0 = 1 + e^{i\pi}$

The Imitation Game: Alan M. Turing and ...

How can these inequalities be applied to the estimation of the number of zeros?

Subsections: Improved Chao estimate; Turing estimate; ZI-estimate; Final result.

Improved Chao estimate

Consider $X$ a count r.v. with probabilities $p_k$, $k = 0, 1, 2, \ldots$, where only the zero-truncated r.v. $X \mid X > 0$ (positive values) is observed.

Let $x = (x_1, x_2, \ldots, x_n)$ be a sample of size $n$ of $X \mid X > 0$, and let $f_k$ denote the number (frequency) of $x_i$ equal to $k$, $k = 1, 2, \ldots, m$ ($m$ is the largest count observed in the sample). Evidently $f_1 + f_2 + \cdots + f_m = n$.

Let $f_0$ denote the number of non-observed zeros, to be estimated. The size of the complete sample (counting the zeros) would be $N = f_0 + n$, which represents the total number of individuals in the capture-recapture experiment.

Taking into account that $p_i \approx f_i/N$, the inequalities (2) lead to the following lower-bound estimates of $f_0$:
$$\hat f_{r,k} = \frac{f_k\, f_r}{\binom{k+r}{r}\, f_{k+r}}, \quad 1 \le k, r, \quad k + r \le m. \qquad (6)$$
The well-known Chao (1984, 1987) estimator of $f_0$ is obtained for $r = k = 1$.
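A minimal sketch of the family of estimates (6) follows. Two assumptions are made loudly here. First, pairs with $f_{k+r} = 0$ are skipped, since the slides do not say how they are treated. Second, the cholera frequencies used as input (32, 16, 6 and 1 households with 1 to 4 infections) are assumed: they are not legible on these slides, but they sum to $223 - 168 = 55$ and reproduce the values $\hat f_C = 32$ and $\hat f_M = (f_1 f_3)/(4 f_4) = 48$ reported in the applications.

```python
from math import comb

def chao_type_estimates(freq):
    """Estimates (6) from freq = {k: f_k} for k >= 1, keyed by the pair (k, r).

    Only pairs with k <= r are returned (the estimate is symmetric in k, r),
    and pairs with f_{k+r} = 0 are skipped (an assumption of this sketch).
    """
    m = max(freq)
    estimates = {}
    for k in range(1, m):
        for r in range(k, m - k + 1):
            f_k, f_r, f_kr = freq.get(k, 0), freq.get(r, 0), freq.get(k + r, 0)
            if f_kr > 0:
                estimates[(k, r)] = f_k * f_r / (comb(k + r, r) * f_kr)
    return estimates

# Assumed McKendrick frequencies (see the lead-in above).
cholera = {1: 32, 2: 16, 3: 6, 4: 1}
est = chao_type_estimates(cholera)
f_C = est[(1, 1)]           # Chao's estimator, r = k = 1
f_M = max(est.values())     # largest lower-bound estimate
```

Under these assumed counts the maximum is attained at the pair $(1, 3)$, matching the expression $(f_1 f_3)/(4 f_4)$ quoted for the cholera data.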

Turing estimate

Inequalities (i) and (ii) in Proposition (II) also yield lower-bound estimates of $f_0$. The population mean $\mu$ in (i)-(ii) can be replaced by
$$\tilde\mu := \frac{s}{n + f_0}, \quad \text{where } s = \sum_{i=1}^{n} x_i.$$
Then inequality (ii) in Proposition (II) leads to
$$\frac{f_0}{n + f_0} \ge \frac{f_1/(n + f_0)}{s/(n + f_0)},$$
and isolating $f_0$ we obtain Turing's estimator of $f_0$,
$$\hat f_T = \frac{n f_1}{s - f_1}. \qquad (7)$$

Note: the so-called Good-Turing estimator [vi] of the population size, $\hat N_T = \hat f_T + n = n/(1 - f_1/s)$, underestimates it for the (very wide) family of distributions with log-convex pgf, by Prop. (II).

vi. See Good (1953), Chao & Lin (2012), Chiu et al. (2014), for capture-recapture problems.
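Turing's estimator (7) needs only $n$, $s$ and $f_1$. In the sketch below the function name is hypothetical and the input frequencies are the assumed cholera counts (32, 16, 6, 1 households with 1 to 4 infections), which are not legible on these slides and are used purely for illustration.

```python
def turing_estimate(freq):
    """Turing's estimator (7) of f_0 from freq = {k: f_k}, k >= 1."""
    n = sum(freq.values())
    s = sum(k * f for k, f in freq.items())   # sum of all observed counts
    f1 = freq.get(1, 0)
    if s == f1:
        raise ValueError("all observations equal 1: the estimator is undefined")
    return n * f1 / (s - f1)

cholera = {1: 32, 2: 16, 3: 6, 4: 1}          # assumed counts
n = sum(cholera.values())
s = sum(k * f for k, f in cholera.items())
f_T = turing_estimate(cholera)
N_T = f_T + n                                  # Good-Turing population size
```

The identity $\hat N_T = n/(1 - f_1/s)$ can be used as a sanity check on the implementation.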

ZI-estimate

Replacing again $\mu$ by $\tilde\mu := s/(n + f_0)$ in inequality (i) of Prop. (II), we obtain
$$\frac{f_0}{n + f_0} \ge \exp\left(-\frac{s}{n + f_0}\right) \iff \frac{x}{1 + x} \ge \exp\left(-\frac{\bar x}{1 + x}\right),$$
where $x = f_0/n$ and $\bar x = s/n$. From here, we define the zi-estimator of $f_0$,
$$\hat f_Z = n \hat x, \qquad (8)$$
where $\hat x$ is the unique solution of the equation
$$\log\left(\frac{1 + x}{x}\right)(1 + x) = \bar x. \qquad (9)$$

Note: this estimator is well defined because the left side of (9) is a decreasing function of $x$, becoming infinite at $x = 0$ and tending to 1 as $x$ grows, and $\bar x > 1$.

Set of (under)estimators of $f_0$: $\hat f_{r,k}$, $\hat f_T$, $\hat f_Z$.
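Equation (9) has no closed-form solution, but because its left side is decreasing a simple bisection suffices. The sketch below is one possible implementation, not the authors' code; the function name, the bracketing constants and the assumed cholera frequencies (32, 16, 6, 1) are illustrative assumptions.

```python
from math import log

def zi_estimate(freq, n_iter=200):
    """zi-estimator (8): n times the root of (1 + x) * log((1 + x) / x) = s / n."""
    n = sum(freq.values())
    s = sum(k * f for k, f in freq.items())
    xbar = s / n
    if xbar <= 1.0:
        raise ValueError("the equation has a solution only when the sample mean exceeds 1")
    g = lambda x: (1.0 + x) * log((1.0 + x) / x) - xbar   # decreasing in x
    lo, hi = 1e-300, 1.0
    while g(hi) > 0.0:            # expand until the root is bracketed
        hi *= 2.0
    for _ in range(n_iter):       # bisection
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return n * 0.5 * (lo + hi)

cholera = {1: 32, 2: 16, 3: 6, 4: 1}   # assumed counts
f_Z = zi_estimate(cholera)
```

Bisection is chosen here only for robustness; any bracketing root finder would do, since the monotonicity noted above guarantees a unique root.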

Final results of estimation

Because $\hat f_{r,k}$, $\hat f_T$ and $\hat f_Z$ underestimate $f_0$, we propose the estimator obtained by maximizing over all of them. With
$$\hat f_M = \max_{r,k} \frac{f_k\, f_r}{\binom{k+r}{r}\, f_{k+r}}, \quad 1 \le k, r, \quad k + r \le m,$$
we take, for Compound- or Mixed-Poisson distributions,
$$\hat f_0 = \max\{\hat f_M,\ \hat f_Z,\ \hat f_T\}. \qquad (10)$$
If $\hat f_C = f_1^2/(2 f_2)$ is Chao's estimator ($r = k = 1$), it is suitable to consider, for distributions with log-convex pgf,
$$\hat f_0' = \max\{\hat f_C,\ \hat f_Z,\ \hat f_T\}. \qquad (11)$$

Remark: the variances of $\hat f_0$ and $\hat f_0'$ are complicated. We therefore suggest using a bootstrap method to estimate the variance and the associated confidence interval for any given sample.
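The two combined estimators and a bootstrap for their variability can be sketched as follows. Several choices here are assumptions rather than the authors' procedure: the nonparametric resampling of the $n$ observed positive counts, the percentile interval, the number of replicates, the skipping of pairs with $f_{k+r} = 0$, and the cholera frequencies (32, 16, 6, 1), which are not legible on these slides. The symbol names `f0_hat` and `f0_prime` stand for the estimators (10) and (11).

```python
import numpy as np
from math import comb, log

def zi(n, s):
    """n times the root of (1 + x) * log((1 + x) / x) = s / n, by bisection."""
    xbar = s / n
    if xbar <= 1.0:
        return float("nan")
    g = lambda x: (1.0 + x) * log((1.0 + x) / x) - xbar   # decreasing in x
    lo, hi = 1e-300, 1.0
    while g(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0.0 else (lo, mid)
    return n * 0.5 * (lo + hi)

def combined(freq):
    """Return the estimates (10) and (11) from freq = {k: f_k}, k >= 1."""
    freq = {k: f for k, f in freq.items() if f > 0}
    n = sum(freq.values())
    s = sum(k * f for k, f in freq.items())
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    # Pairs whose f_{k+r} is zero are skipped (an assumption of this sketch).
    pairs = [freq[k] * freq[r] / (comb(k + r, r) * freq[k + r])
             for k in freq for r in freq if k + r in freq]
    f_M = max(pairs, default=float("nan"))
    f_C = f1 ** 2 / (2 * f2) if f2 > 0 else float("nan")
    f_T = n * f1 / (s - f1) if s > f1 else float("nan")
    f_Z = zi(n, s)
    return np.nanmax([f_M, f_Z, f_T]), np.nanmax([f_C, f_Z, f_T])

def bootstrap(freq, B=1000, seed=0):
    """Resample the n observed counts with replacement and re-estimate B times."""
    rng = np.random.default_rng(seed)
    ks = sorted(freq)
    fs = np.array([freq[k] for k in ks], dtype=float)
    n = int(fs.sum())
    reps = np.array([combined(dict(zip(ks, rng.multinomial(n, fs / n))))
                     for _ in range(B)])
    return {"mean": reps.mean(axis=0), "sd": reps.std(axis=0, ddof=1),
            "ci95": np.percentile(reps, [2.5, 97.5], axis=0)}

cholera = {1: 32, 2: 16, 3: 6, 4: 1}   # assumed counts
f0_hat, f0_prime = combined(cholera)
```

Calling `bootstrap(cholera)` returns the bootstrap mean, standard deviation and 95% percentile interval for both estimators; the interval limits depend on the seed and on the number of replicates.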

Three Examples of Application

Coming back to:

1. Cholera data set of McKendrick's problem
2. Number of words known but unused by Shakespeare
3. Number of grizzly bear females in Yellowstone

Cholera data set of McKendrick (1926)

Cholera in 223 households in a village in India; the table gives the number of households (frequency) for each number of infections, with 168 households at zero.

Result: $\hat f_M = 48$, $\hat f_Z = [...]$, $\hat f_T = [...]$ and $\hat f_C = 32$.

Here $\hat f_M = (f_1 f_3)/(4 f_4) = 48$, so that $\hat f_0 = 48$ and $\hat f_0' = 33$.

Variability using 5 bootstrap samples (and CI by the quantile method):

Estimator | Mean | SD | 95% CI
$\hat f_0$ | [...] | [...] | [26.5, 17.68]
$\hat f_0'$ | [...] | [...] | [22.38, 72.82]

Note that prior knowledge about the distributional pattern is important, because the width of the confidence interval is in general greater for $\hat f_0$.

Number of words known but unused by Shakespeare

A reduced version of the full table (occurrences against number of words) is reported in Efron and Thisted (1976).

Result (using the full table): $\hat f_M = [...]$, $\hat f_Z = [...]$, $\hat f_T = [...]$ and $\hat f_C = [...]$.

Here $\hat f_0 = \hat f_0' = \hat f_C$, Chao's estimator.

The simulation of 1 bootstrap samples produces:

Estimator | Mean | SD | 95% CI
$\hat f_0$ | [...] | [...] | [...]
$\hat f_0'$ | [...] | [...] | [...]

Number of grizzly bear females in Yellowstone

Estimation of the population of grizzly bear females (Keating et al., 2002); the table gives the number of bears (frequency) for each number of sightings.

Result: $\hat f_M = [...]$, $\hat f_Z = [...]$, $\hat f_T = [...]$ and $\hat f_C = [...]$.

Here $\hat f_0 = 28$ and $\hat f_0' = 6$. Adding the observed number of bears, 33, the estimated population sizes are $\hat N = 61$ and $\hat N' = 39$.

The simulation of 5 bootstrap samples produces:

Estimator | Mean | SD | 95% CI
$\hat f_0$ | [...] | [...] | [5.83, 6.17]
$\hat f_0'$ | [...] | [...] | [3.5, 16.]
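The reported values $\hat f_0 = 28$ and $\hat f_0' = 6$ can be retraced under an assumption about the data. Only $f_1 = 11$, $f_2 = 13$ and $n = 33$ are stated on the slides; the remaining frequencies used below ($f_3 = 5$, $f_4 = 1$, $f_5 = 1$, $f_6 = 0$, $f_7 = 2$) are assumed for this worked example, chosen because they sum to 33 and are consistent with the reported results.

```latex
\begin{align*}
n &= 11 + 13 + 5 + 1 + 1 + 0 + 2 = 33, \\
s &= 11\cdot 1 + 13\cdot 2 + 5\cdot 3 + 1\cdot 4 + 1\cdot 5 + 0\cdot 6 + 2\cdot 7 = 75, \\
\hat f_C &= \frac{f_1^2}{2 f_2} = \frac{121}{26} \approx 4.65, \\
\hat f_{2,2} &= \frac{f_2^2}{\binom{4}{2} f_4} = \frac{169}{6} \approx 28.17, \\
\hat f_T &= \frac{n f_1}{s - f_1} = \frac{33 \cdot 11}{64} \approx 5.67 .
\end{align*}
```

Under these assumed counts, $\hat f_{2,2} \approx 28.17$ agrees with $\hat f_0 = 28$ and $\hat f_T \approx 5.67$ agrees with $\hat f_0' = 6$ after rounding. Pairs with $f_{k+r} = 0$ (here $k + r = 6$) have to be left out, since the corresponding estimate in (6) is undefined.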

Jo mai perdo. O bé guanyo, o n'aprenc.
I never lose. I either win or I learn.

Supplementary References

a. Böhning, D. (2010). Some general comparative points on Chao's and Zelterman's estimators of the population size. Scand. J. Statist. 37.
b. Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Statist. 11.
c. Chao, A. (1987). Estimating the population size for capture-recapture data with unequal catchability. Biometrics 43.
d. Chao, A., Lin, C.-W. (2012). Nonparametric lower bounds for species richness and shared species richness under sampling without replacement. Biometrics 68.
e. Chiu, C.-H., Wang, Y.-T., Walther, B.A., Chao, A. (2014). An improved nonparametric lower bound of species richness via a modified Good-Turing frequency formula. Biometrics 70.
f. Good, I.J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40.
g. Johnson, N.L., Kemp, A.W., Kotz, S. (2005). Univariate Discrete Distributions (3rd ed.). Wiley, New Jersey.
h. Kemp, A.W., Kemp, C.D. (1966). An alternative derivation of the Hermite distribution. Biometrika 53.
i. Lanumteang, K., Böhning, D. (2011). An extension of Chao's estimator of population size based on the first three capture frequency counts. Comput. Statist. Data Anal. 55.
j. Steutel, F.W., van Harn, K. (2004). Infinite Divisibility of Probability Distributions on the Real Line (1st ed.). Dekker, New York.

Thanks - Gràcies - Merci - Singuila

Proof of Proposition (I)

See Steutel and van Harn (2004, Chap. II, p. 51) for the Compound-Poisson distributions.

For the Mixed-Poisson distributions, note that the inequalities (1) are equivalent to
$$\int e^{-\lambda} \lambda^{k}\, dF(\lambda) \int e^{-\lambda} \lambda^{r}\, dF(\lambda) \le \int e^{-\lambda} \lambda^{r+k}\, dF(\lambda) \int e^{-\lambda}\, dF(\lambda). \qquad (12)$$
Defining the probability measure over the positive reals
$$dG(\lambda) = \frac{e^{-\lambda}\, dF(\lambda)}{\int e^{-\lambda}\, dF(\lambda)},$$
the inequality (12) can be written as
$$E(Y^r)\, E(Y^k) \le E(Y^{k+r}),$$
where $Y$ is a positive r.v. with distribution $G$.

It is well known that for any positive r.v. $Y$, $E(Y^s)^{1/s} \le E(Y^z)^{1/z}$ for all $0 < s \le z$ (moment monotonicity). Without loss of generality we can assume that $r \le k$. Then
$$E(Y^{k+r}) \ge E(Y^k)^{(k+r)/k} = E(Y^k)\, E(Y^k)^{r/k} \ge E(Y^k)\, E(Y^r),$$
and the proof is complete.

Proof of Proposition (II)

(i) Due to the convexity, the tangent line to $\log(\Phi_X(t))$ at $t = t_0$ always lies below $\log(\Phi_X(t))$, that is,
$$\log(\Phi_X(t)) \ge \frac{\Phi_X'(t_0)}{\Phi_X(t_0)}(t - t_0) + \log(\Phi_X(t_0)).$$
In particular, for $t_0 = 1$, taking into account that $\Phi_X'(1) = \mu$ and $\Phi_X(1) = 1$, we obtain $\log(\Phi_X(t)) \ge \mu(t - 1)$, and for $t = 0$ this leads to $\log(p_0) \ge -\mu$.

(ii) Note that the first derivative of $\log(\Phi_X(t))$ is an increasing function for $t \in [0, 1]$. In particular, the second inequality is deduced from
$$\frac{\Phi_X'(0)}{\Phi_X(0)} \le \frac{\Phi_X'(1)}{\Phi_X(1)}, \quad \text{i.e.} \quad \frac{p_1}{p_0} \le \mu.$$

(iii) The third inequality is a direct consequence of the pgf log-convexity at $t = 0$. Because $\log(\Phi_X(t))$ is a convex function, calculating the second derivative we obtain $\Phi_X(t)\,\Phi_X''(t) \ge (\Phi_X'(t))^2$. Evaluating this expression at $t = 0$ gives $2 p_0 p_2 \ge p_1^2$, which is the third inequality.

Note: evaluating the expression $\Phi_X(t)\,\Phi_X''(t) \ge (\Phi_X'(t))^2$ at $t = 1$ gives $E[X(X-1)] \ge \mu^2$, i.e. $\mathrm{Var}\,X \ge \mu$, so any count r.v. having a log-convex pgf is overdispersed.

Proof of Proposition (III)

Because the Lemma establishes the set of inequalities
$$p_0 \ge \frac{p_1^2}{2 p_2} \ge \frac{p_2\, p_1}{3 p_3} \ge \cdots \ge \frac{p_k\, p_1}{(k+1)\, p_{k+1}} \ge \cdots,$$
we need only to prove that
$$\frac{p_1}{(k+1)\, p_{k+1}} \ge \frac{p_r}{\binom{k+r}{r}\, p_{k+r}}, \quad r = 2, 3, \ldots$$
This inequality is equivalent to
$$\int e^{-\lambda} \lambda^{k+1}\, dF(\lambda) \int e^{-\lambda} \lambda^{r}\, dF(\lambda) \le \int e^{-\lambda} \lambda^{r+k}\, dF(\lambda) \int \lambda e^{-\lambda}\, dF(\lambda).$$
Similarly to the proof of Proposition (I), defining the probability measure
$$dG(\lambda) = \frac{\lambda e^{-\lambda}\, dF(\lambda)}{\int \lambda e^{-\lambda}\, dF(\lambda)},$$
the inequality can be expressed as
$$E(Y^{r-1})\, E(Y^k) \le E(Y^{k+r-1}),$$
where $Y$ is a r.v. with distribution $G$. Using again the moment monotonicity, the proof is completed.


Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

On discrete distributions with gaps having ALM property

On discrete distributions with gaps having ALM property ProbStat Forum, Volume 05, April 202, Pages 32 37 ISSN 0974-3235 ProbStat Forum is an e-journal. For details please visit www.probstat.org.in On discrete distributions with gaps having ALM property E.

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information
