Estimation of Binomial Distribution in the Light of Future Data

Similar documents
New Expansion and Infinite Series

Acceptance Sampling by Attributes

Non-Linear & Logistic Regression

Decision Science Letters

63. Representation of functions as power series Consider a power series. ( 1) n x 2n for all 1 < x < 1

A NOTE ON ESTIMATION OF THE GLOBAL INTENSITY OF A CYCLIC POISSON PROCESS IN THE PRESENCE OF LINEAR TREND

Lecture 3 Gaussian Probability Distribution

Research Article Moment Inequalities and Complete Moment Convergence

Continuous Random Variables

Numerical integration

The steps of the hypothesis test

The Shortest Confidence Interval for the Mean of a Normal Distribution

1 Probability Density Functions

Time Truncated Two Stage Group Sampling Plan For Various Distributions

Credibility Hypothesis Testing of Fuzzy Triangular Distributions

MIXED MODELS (Sections ) I) In the unrestricted model, interactions are treated as in the random effects model:

AQA Further Pure 1. Complex Numbers. Section 1: Introduction to Complex Numbers. The number system

Predict Global Earth Temperature using Linier Regression

Improper Integrals. Type I Improper Integrals How do we evaluate an integral such as

Review of Calculus, cont d

Undergraduate Research

Realistic Method for Solving Fully Intuitionistic Fuzzy. Transportation Problems

Riemann Sums and Riemann Integrals

Monte Carlo method in solving numerical integration and differential equation

Chapter 5 : Continuous Random Variables

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction

Riemann Sums and Riemann Integrals

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by.

1.2. Linear Variable Coefficient Equations. y + b "! = a y + b " Remark: The case b = 0 and a non-constant can be solved with the same idea as above.

Arithmetic Mean Derivative Based Midpoint Rule

Bernoulli Numbers Jeff Morton

THE EXISTENCE OF NEGATIVE MCMENTS OF CONTINUOUS DISTRIBUTIONS WALTER W. PIEGORSCH AND GEORGE CASELLA. Biometrics Unit, Cornell University, Ithaca, NY

P 3 (x) = f(0) + f (0)x + f (0) 2. x 2 + f (0) . In the problem set, you are asked to show, in general, the n th order term is a n = f (n) (0)

Data Assimilation. Alan O Neill Data Assimilation Research Centre University of Reading

UNIFORM CONVERGENCE. Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3

S. S. Dragomir. 2, we have the inequality. b a

ARITHMETIC OPERATIONS. The real numbers have the following properties: a b c ab ac

Reversals of Signal-Posterior Monotonicity for Any Bounded Prior

Solution for Assignment 1 : Intro to Probability and Statistics, PAC learning

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Journal of Inequalities in Pure and Applied Mathematics

7.2 The Definite Integral

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1

For the percentage of full time students at RCC the symbols would be:

On a Method to Compute the Determinant of a 4 4 Matrix

Construction and Selection of Single Sampling Quick Switching Variables System for given Control Limits Involving Minimum Sum of Risks

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral

Tests for the Ratio of Two Poisson Rates

Driving Cycle Construction of City Road for Hybrid Bus Based on Markov Process Deng Pan1, a, Fengchun Sun1,b*, Hongwen He1, c, Jiankun Peng1, d

Hybrid Group Acceptance Sampling Plan Based on Size Biased Lomax Model

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

Lecture 21: Order statistics

The Algebra (al-jabr) of Matrices

3.4 Numerical integration

Read section 3.3, 3.4 Announcements:

Calculus 2: Integration. Differentiation. Integration

NOTE ON TRACES OF MATRIX PRODUCTS INVOLVING INVERSES OF POSITIVE DEFINITE ONES

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS.

Student Activity 3: Single Factor ANOVA

4 7x =250; 5 3x =500; Read section 3.3, 3.4 Announcements: Bell Ringer: Use your calculator to solve

The Predom module. Predom calculates and plots isothermal 1-, 2- and 3-metal predominance area diagrams. Predom accesses only compound databases.

( dg. ) 2 dt. + dt. dt j + dh. + dt. r(t) dt. Comparing this equation with the one listed above for the length of see that

Penalized least squares smoothing of two-dimensional mortality tables with imposed smoothness

S. S. Dragomir. 1. Introduction. In [1], Guessab and Schmeisser have proved among others, the following companion of Ostrowski s inequality:

An approximation to the arithmetic-geometric mean. G.J.O. Jameson, Math. Gazette 98 (2014), 85 95

ECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature

Reinforcement learning II

On the Generalized Weighted Quasi-Arithmetic Integral Mean 1

Using QM for Windows. Using QM for Windows. Using QM for Windows LEARNING OBJECTIVES. Solving Flair Furniture s LP Problem

Exponentials - Grade 10 [CAPS] *

Self-similarity and symmetries of Pascal s triangles and simplices mod p

5.7 Improper Integrals

Jim Lambers MAT 169 Fall Semester Lecture 4 Notes

20 MATHEMATICS POLYNOMIALS

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007

The Regulated and Riemann Integrals

We will see what is meant by standard form very shortly

N 0 completions on partial matrices

Integral points on the rational curve

Math Lecture 23

Chapter 1. Basic Concepts

MAC-solutions of the nonexistent solutions of mathematical physics

STEP FUNCTIONS, DELTA FUNCTIONS, AND THE VARIATION OF PARAMETERS FORMULA. 0 if t < 0, 1 if t > 0.

Czechoslovak Mathematical Journal, 55 (130) (2005), , Abbotsford. 1. Introduction

1 Online Learning and Regret Minimization

Chapter 0. What is the Lebesgue integral about?

Euler, Ioachimescu and the trapezium rule. G.J.O. Jameson (Math. Gazette 96 (2012), )

Research Article Harmonic Deformation of Planar Curves

Math 31S. Rumbos Fall Solutions to Assignment #16

The Modified Heinz s Inequality

Research Article On Hermite-Hadamard Type Inequalities for Functions Whose Second Derivatives Absolute Values Are s-convex

Math 1B, lecture 4: Error bounds for numerical methods

New data structures to reduce data size and search time

Section 6.1 INTRO to LAPLACE TRANSFORMS

A Convergence Theorem for the Improper Riemann Integral of Banach Space-valued Functions

8 Laplace s Method and Local Limit Theorems

THIELE CENTRE. Linear stochastic differential equations with anticipating initial conditions

Math 113 Fall Final Exam Review. 2. Applications of Integration Chapter 6 including sections and section 6.8

Chapter 1: Fundamentals

Transcription:

British Journl of Mthemtics & Computer Science 102: 1-7, 2015, Article no.bjmcs.19191 ISSN: 2231-0851 SCIENCEDOMAIN interntionl www.sciencedomin.org Estimtion of Binomil Distribution in the Light of Future Dt Kunio Tkezw 1 1 Agroinformtics Division, Agriculturl Reserch Center, Ntionl Agriculture nd Food Reserch Orgniztion Knnondi 3-1-1, Tsukub, Ibrki 305-8666, Jpn. Article Informtion DOI: 10.9734/BJMCS/2015/19191 Editors: 1 H. M. Srivstv, Deprtment of Mthemtics nd Sttistics, University of Victori, Cnd. Reviewers: 1 P. E. Oguntunde, Deprtment of Mthemtics, Covennt University, Nigeri. 2 Jos Alejndro Gonzlez Cmpos, Deprtment of Mthemtic nd Sttistic, Ply Anch University, Chile. Complete Peer review History: http://sciencedomin.org/review-history/10077 Short Reserch Article Received: 29 My 2015 Accepted: 15 June 2015 Published: 07 July 2015 Abstrct A predictive estimtor for estimting the prmeter of binomil distribution is suggested. This estimtor ims to mximize the expecttion of expected log-likelihood. The results given by this estimtor re superior to those given by the mximum likelihood estimtor in terms of the predictions when little prior knowledge bout the prmeter is vilble. Keywords: expected log-likelihood; binomil distribution; mximum likelihood estimtor; optimiztion 2010 Mthemtics Subject Clssifiction: 60G25; 62F10; 62M20 1 Introduction The mximum likelihood estimtor nd the unbised estimtor re common tools, which re used to estimte the prmeters of probbility density function from dt. However, the mximum likelihood estimtor is known to give uncceptble estimtes in some situtions e.g., pge 343 in [1]. The unbised estimtor does not exist under certin conditions nd, even if it exists, it my not be invrint pge 415 in [1]. In this pper, I propose predictive estimtor for estimting the prmeter of binomil distribution. The estimtor ims to mximize the expecttion of the expected log-likelihood, nd herefter is clled the predictive estimtor. Becuse nothing is known bout the form of the predictive estimtor, I ssume very simple form. Therefore, predictive estimtors with other forms my yield better results. *Corresponding uthor: E-mil: nonpr@gmil.com

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 2 Predictive Estimtor for Binomil Distribution The probbility density function of the binomil distribution is written s n fξ = p ξ 1 p n ξ ξ = 0, 1, 2,..., n. 2.1 ξ where p is the true vlue of the prmeter nd ξ is the vrible. The expecttion of Eq.2.1 cn be written s n ξfξ = n p. 2.2 ξ=0 Assuming tht X is rndom vrible tht obeys the probbility density function fξ nd its reliztion i.e., dt is clled x, then the log-likelihood lp x of the dt is given s lp x n = log + xlogp + n xlog1 p. 2.3 n x To derive the p tht mximizes the bove vlue, the vlue is differentited with respect to p nd set equl to 0 s follows: x p n x = 0. 2.4 1 p Therefore, the mximum likelihood estimtor ˆp is obtined from the dt t hnd x s ˆp = x n. 2.5 Next, future dt re nmed {x i } 1 i m; future dt comes from nother smpling thn tht for dt t hnd x. The log-likelihood lˆp {x i } of ˆp in the light of this dt is represented s lˆp {x i } = 1 m n log + 1 m x i logˆp + 1 m n x i log1 ˆp. 2.6 m m m m x i The ˆp tht mximizes this vlue is termed ˆp nd is depicted s m ˆp = x i mn. 2.7 Becuse the number of future dt is infinite, m is set to be infinite. In this sitution, the vlue of ˆp is clled ˆp nd Eq.2.2 leds to m ˆp = lim x i = p. 2.8 m mn Here, the mximum likelihood estimtor for n infinite number of future dt is the true vlue of the prmeter i.e., p. Substitution of Eq.2.8 into Eq.2.6 yields lˆp {x i } 1 m n lim = lim log + n plogˆp + n1 plog1 ˆp. 2.9 m m m m x i When the bove eqution is tken into ccount nd the log-likelihood of ˆp for n infinite number of future dt is nmed l ˆp, we hve l 1 m n ˆp = lim log + n plogˆp + n1 plog1 ˆp. 2.10 m m x i 2

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 Becuse the first term of the right-hnd side does not ffect derivtion of ˆp, this term cn be omitted nd the log-likelihood, nmed l ˆp, cn be written s l ˆp = n plogˆp + n1 plog1 ˆp. 2.11 The ˆp given by Eq.2.5 my not be the optiml estimtor in the light of future dt; therefore, the optiml estimtor for future dt i.e., the predictive estimtor, here clled ˆp +, cn be represented s ˆp + = x, 2.12 where is constnt. Hence, the log-likelihood of ˆp + for n infinite number of future dt is l ˆp + x = n plog + n1 plog 1 x. 2.13 By differentiting this eqution with respect to nd setting it equl to 0, we obtin = n. 2.14 x p The resulting eqution cn be written s which is n obvious reltionship. ˆp + = p, 2.15 Next, let us consider the verged vlue of l ˆp + i.e., expecttion of l ˆp +, which is given by smpling x infinite times. This expecttion is represented s ] [ E x l ˆp + ] x = E x [n plog = n 1 ξ=1 n plog ξ + n1 plog 1 x + n1 plog n 1 ξ=1 1 ξ n p r 1 p n ξ ξ. n p ξ ξ 1 p n ξ 2.16 The rnge of summtion of the right-hnd side of the second line of this eqution is from r = 1 through r = n 1 insted of r = 0 through r = n. This is becuse the possibility tht x tkes the vlue of 0 or n hs to be excluded. As result of this mesure, the expecttion is stndrdized by the constnt i.e., the denomintor. Moreover, if is set to be negtive, the rgument of log 1 x cn be negtive; hence, hs to be 0 or positive. It should be noted tht the type of expecttion used in Eq.2.16 is found in the derivtion of AIC Akike s informtion criterion. For exmple, E Gxn on pge 55 in [2] is similr expecttion. When = 0 is ssumed in Eq.2.16, it yields the expecttion of expected log-likelihood. Therefore, if the vlue of Eq.2.16 for 0 is lrger thn its vlue for = 0, n tht holds 0 will give better estimtor. Then, is defined s [ = E x l ˆp + ] [ E x l ˆp ] x = E x [n plog + n1 plog 1 x ] x E x [n plog + n1 plog 1 x ]. 2.17 n n 3

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 n= 5, p ~ =0.1 n= 5, p ~ =0.3 n= 5, p ~ =0.5 n= 5, p ~ =0.7 Figure 1: given by Eq.2.17 when n = 5. p = 0.1 top left; p = 0.3 top right; p = 0.5 bottom left; nd p = 0.7 bottom right. This is clculted using Eq.2.16. When is clculted ssuming n = 5, p = {0.1, 0.3, 0.5, 0.7}, nd = {0.1, 0.2, 0.3,..., 2.0} the results re shown in Figure1. When n = 10 is ssumed nd the other prmeters remin the sme s those used in Figure 1, the results re shown in Figure2. For exmple, if we know from the phenomenon generting the dt tht n = 10 nd 0.1 p 0.5, should be set t 0 to 1.5. In this sitution, when is set t 0, it gives the mximum likelihood estimtor. Hence, when = 0.5 or = 1 is used for exmple, the verged expecttion of expected log-likelihood yielded by repeted smplings is lrger thn the verged expecttion given by the mximum likelihood estimtor Figure 2, which indictes tht this predictive estimtor is more desirble estimtor. To optimize the vlue of exctly, detiled prior knowledge bout the prmeters or settings of the dt re required. Next, we consider the sitution when the probbility of success obeys the binomil distribution nd the probbility of filure lso obeys the binomil distribution. Assuming, p s is the probbility of success, p f is the probbility of filure, nd p s + p f = 1. If we know tht n = 10 nd 0.1 p s 0.5, then setting in 2.12 t 0.5 or 1 gives lrger vlue of the expecttion of expected log-likelihood thn is given by the mximum likelihood estimtor. By contrst, becuse we know 0.5 p f 0.9, setting of in Eq.2.12 t 0 i.e., use of the mximum likelihood estimtor is resonble choice, nd ˆp + s + ˆp + f 1. Conventionlly, we believe tht when ps + p f = 1 holds, ˆp s + ˆp f = 1 is stisfied. 4

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 0.4 n= 10, p ~ =0.1 n= 10, p ~ =0.3 n= 10, p ~ =0.5 n= 10, p ~ =0.7 Figure 2: given by Eq.2.17 when n = 10. p = 0.1 top left; p = 0.3 top right; p = 0.5 bottom left; nd p = 0.7 bottom right. This inference is not obviously vlid. When the purpose of the estimtion is mximiztion of the expecttion of expected log-likelihood, this reltionship is no longer stisfied. 3 Conclusions Byes estimtor [3-8] is conventionlly the most common tool for using prior knowledge to estimte prmeters of probbility density function. The predictive estimtor method proposed here uses prior knowledge such s 0.1 p s 0.5 to mke the expecttion of expected log-likelihood lrger thn tht given by the mximum likelihood estimtor. In the cse of the predictive estimtor for the exponentil distribution, the estimtor given by multiplying 1 1 n n is the number of dt with the mximum likelihood estimtor yields lrger vlue for the expected log-likelihood thn simply using the mximum likelihood estimtor [9]. Therefore, even if no prior informtion bout the true prmeter is vilble, predictive estimtor tht performs better thn the mximum likelihood estimtor is vilble. Conversely, our predictive estimtor of the binomil distribution requires some prior knowledge bout the prmeters. However, predictive estimtor tht does not need ny prior knowledge of the prmeters is still possible. Furthermore, if we develop method of constructing n optiml predictive estimtor depending on the prior knowledge on the prmeters, we expect tht such predictive estimtor will replce the mximum likelihood estimtor. The 5

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 sme is true for the estimtion of prmeters of distributions other thn the binomil distribution. The MSE men squred error is defined s below pge 330 in [1]. MSE = E θ W θ 2, 3.1 where θ is the true prmeter, W is the estimtor of the prmeter, nd E θ is the expecttion given by verging the results of repeted smplings. As noted on pge 332 of [1], MSE depends upon the true prmeter; hence, some prior knowledge bout the prmeters is needed to estimte the prmeters by minimizing MSE. In this regrd, minimizing MSE is similr to mximizing the expecttion of expected log-likelihood. Estimtion of vrince of dt by minimizing MSE, however, does not need such prior knowledge. This sitution should be hndled s n exception in the sme wy tht the exponentil distribution hs n exceptionl predictive estimtor nd the norml distribution hs third vrince [10-11]. Moreover, methodologies bsed on predictive estimtors hve something in common with smoothing splines e.g., sections 2 nd 3 of [12], nd section 3 of [13]. Currently, no definitive theory on the desirble form of the roughness penlty in smoothing splines is vilble; hence, we could not confirm tht the most commonly-used roughness penlty is the optiml one. Diverse theories nd numericl simultions, however, indicte tht our common roughness penlty is vlid from vrious perspectives. Further, we hve not confirmed tht the use of Eq.2.12 s predictive estimtor is pproprite or optiml. However, Fig.1 nd Fig.2 show tht when is set in vlid rnge, the predictive estimtor defined in Eq.2.12 outperforms the mximum likelihood estimtor. Therefore, lthough we cnnot deny the possibility tht better predictive estimtors exist, tenttively Eq.2.12 with n pproprite vlue of cn be used. Tht is, setting n pproprite vlue of corresponds to setting pproprite vlues for the smoothing prmeters in smoothing splines. Thus, the use of predictive estimtor such s Eq.2.12 cn be justified in the sme sense tht the use of smoothing splines is justified, even though it hs not been proven tht smoothing splines leds to better results thn those obtined by ll the other estimtion methods. The chrcteristics of the predictive estimtor should now be evluted for vrious estimtions; in prticulr, the predictive estimtor should be compred with the mximum likelihood estimtor nd unbised estimtor. Such studies will help estblish mjor techniques for using dt more efficiently. Acknowledgements The uthor is very grteful to the referees for crefully reding the pper nd for their comments nd suggestions which hve improved the pper. Competing Interests The uthor declres tht no competing interests exist. References [1] Csell G, Berger RL. Sttisticl inference. 2nd ed. Pcific Grove CA: Duxbury Press; 2001. [2] Konishi S, Kitgw G. Informtion criteri nd sttisticl modeling. New York: Springer; 2008. [3] Robert CP. The byesin choice: From decision-theoretic foundtions to computtionl implementtion. 2nd edition. Springer; 2007. 6

Tkezw; BJMCS, 102, 1-7, 2015; Article no.bjmcs.19191 [4] Crlin BP, Louis TA. Byesin methods for dt nlysis. Third Edition. Chpmn & Hll CRC; 2008. [5] O Hgn A, Forster J. Kendll s dvnced theory of sttistics. Byesin Inference, 1 edition. Wiley. 2010;2B. [6] Lee PM. Byesin sttistics: An introduction. Wiley; 2012. [7] Gelmn A, Crlin JB, Stern HS, Dunson DB, Vehtri A, Rubin DB. Byesin dt nlysis, third edition. Chpmn & Hll CRC; 2013. [8] Bessiere P, Mzer E, Ahuctzin JM, Mekhnch K. Byesin progrmming. Chpmn & Hll/CRC; 2013. [9] Tkezw K. Estimtion of the exponentil distribution in the light of future dt. British Journl of Mthemtics & Computer Science. 2015;51:128-132. [10] Tkezw K. A revision of ic for norml error models. Open Journl of Sttistics. 2012;23:309-312. [11] Tkezw K. Lerning regression nlysis by simultion. Springer; 2013. [12] Green PJ, Silvermn BW. Nonprmetric regression nd generlized liner models: A roughness penlty pproch. Chpmn & Hll CRC; 1993. [13] Tkezw K. Introduction to nonprmetric regression. Wiley; 2005. c 2015 Tkezw; This is n Open Access rticle distributed under the terms of the Cretive Commons Attribution License http://cretivecommons.org/licenses/by/4.0, which permits unrestricted use, distribution, nd reproduction in ny medium, provided the originl work is properly cited. Peer-review history: The peer review history for this pper cn be ccessed here Plese copy pste the totl link in your browser ddress br http://sciencedomin.org/review-history/10077 7