The exact confidence limits for unknown probability in Bernoulli models


R.I. Andrushkiw
Department of Mathematical Sciences and Center for Applied Mathematics and Statistics, New Jersey Institute of Technology, Newark, NJ 07102, USA

D.A. Klyushin, Yu.I. Petunin
Department of Cybernetics, Kyiv National Taras Shevchenko University, Kyiv, Ukraine

M.Yu. Savkina
Institute of Mathematics, National Academy of Sciences of Ukraine, Kyiv, Ukraine

CAMS Report 45-5, Spring 2005
Center for Applied Mathematics and Statistics

Abstract

The application of mathematical-statistical models in medical diagnostics often requires the construction of an "exact" confidence interval for the unknown probability $p$ of success in Bernoulli models (the so-called binomial proportion, or proportion of a population). This problem has been considered in a number of papers (see, for example, [1-5] and the references cited there). The website BioMed Central lists numerous citations devoted to this theme. The purpose of our paper is to construct an "exact" confidence interval for the unknown probability $p$ of success in the classical and generalized Bernoulli models.

Keywords: probability, exact confidence interval, Bernoulli models

The Setting

Consider the following test of homogeneity for two populations. Let $G_1$ and $G_2$ be general populations with unknown continuous distribution functions $F_1(u)$ and $F_2(u)$ respectively. Let $x = (x_1, \dots, x_n)$ be a sample from $G_1$ and $y = (y_1, \dots, y_m)$ be a sample from $G_2$. We want to test whether the unknown distribution functions $F_1(u)$ and $F_2(u)$ are the same (hypothesis $H_0$) or not (hypothesis $H_1$). If the hypothesis $H_0$ is true, we have a homogeneous composite sample $(x_1, \dots, x_n, y_1, \dots, y_m)$; otherwise the composite sample is heterogeneous.

For this purpose introduce the variational series $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$, where $x_{(0)} = -\infty$ and $x_{(n+1)} = +\infty$, and consider a random interval $I_{iq} = (x_{(i)}, x_{(i+q)})$ with $i$ and $q$ fixed numbers, $0 \le i$, $1 \le q \le n - i + 1$. The scheme of trials is formulated in the following way: at the $k$th step ($k = 1, \dots, m$) we test whether the sample value $y_k$ belongs to the interval $I_{iq}$ and obtain a set of events $A_k = \{y_k \in I_{iq}\}$, $k = 1, \dots, m$, where every event can occur with a certain probability $p_k = P(A_k)$, $k = 1, \dots, m$. Let us introduce a random variable $\kappa$ that is equal to the number of events $A_k$ arising in $m$ trials. If the hypothesis $H_0$ is true, then all probabilities $p_k$ are the same and equal to

$$p_q = P(A_k \mid H_0) = \frac{q}{n+1}. \tag{1}$$

This scheme is called the generalized Bernoulli model. In papers [6, 7] the probability distribution of the random variable $\kappa$ was determined:

$$P(\kappa = l \mid H_0) = \frac{C^{\,l}_{l+q-1}\, C^{\,m-l}_{m+n-l-q}}{C^{\,m}_{m+n}}, \qquad l = 0, 1, \dots, m, \tag{2}$$

where $n$ and $m$ are the sizes of the samples $x$ and $y$ respectively, $q$ is a fixed number equal to the number of order statistics in the interval $I_{iq}$, and $C^{\,s}_{r}$ is the number of combinations of $r$ elements taken $s$ at a time.

The purpose of our paper is to construct the exact confidence interval $I(\kappa)$ containing the probability (1) on the basis of the value of the random variable $\kappa$. The word "exact" means that the significance level of this confidence interval does not exceed a given number $\beta$ (as a rule, $\beta = 0.05$). With the help of this interval it is possible to propose the following test of the hypotheses $H_0$ and $H_1$:

1) On the basis of the sample $x$ construct the variational series and take a random interval $I_{iq}$, where $i$ and $q$ are fixed numbers.
2) Calculate the statistic $\kappa$, which is equal to the number of elements of the sample $y$ that fall into the interval $I_{iq}$.
3) On the basis of the statistic $\kappa$ construct the confidence interval $I(\kappa)$ with the significance level $\beta$.
4) If the interval $I(\kappa)$ does not cover the probability $p_q$, the hypothesis $H_0$ is rejected; otherwise this hypothesis is not rejected.

Construction of Exact Confidence Interval

Let us construct the interval $I(\kappa)$. Let $\xi$ be an arbitrary integer discrete random variable with the distribution $p_k = P(\xi = k)$, $k = 0, 1, \dots, n$. This random variable generates the function $\varphi(k)$ defined on the set $M = \{0, 1, \dots, n\}$ by the formula $\varphi(k) = p_k$. Consider an arbitrary segment $I = \{k : k_1 \le k \le k_2\} \subset M$. We call the segment $I$ a domain of monotonicity of the function $\varphi(k)$ if $\varphi$ is monotone on $I$, that is, if the condition $i \le k$ ($i, k \in I$) implies $\varphi(i) \le \varphi(k)$ throughout $I$, or implies $\varphi(i) \ge \varphi(k)$ throughout $I$. The integer random variable $\xi$ is called unimodal if its range $M$ can be represented as a union of one or two domains of monotonicity of the function $\varphi(k)$. For example, a random variable with binomial distribution and a random variable with distribution (2) are unimodal.

Remark 1. The concept of a unimodal integer discrete random variable differs from the concept of a unimodal random variable proposed by A. Khinchin when $\xi$ takes more than two values. Indeed, according to Khinchin, the distribution function $F(u)$ of a unimodal random variable is convex on the ray $(-\infty, x_0)$, where $x_0$ is a mode of $\xi$, and is concave on the ray $(x_0, \infty)$. Therefore $F(u)$ can have no more than one break point, whereas a discrete random variable has a staircase distribution function, so that in this case the number of break points is more than two.

In addition to the integer discrete random variable $\xi$ with the distribution $p_k = P(\xi = k)$, $k = 0, 1, \dots, n$, let us consider a continuous random variable $\tilde{\xi}$ with the following probability density:

$$f(u) = \begin{cases} 0, & u < 0, \\ p_k, & k \le u < k + 1, \quad k = 0, 1, \dots, n, \\ 0, & u \ge n + 1. \end{cases}$$

We shall call $\tilde{\xi}$ the inducing continuous random variable. The random variable $\tilde{\xi}$ induces the random variable $\xi$ with the help of the function $\mathrm{Int}$, $\xi = \mathrm{Int}(\tilde{\xi})$, which maps every value $x \in \mathbb{R}$ to its integer part $\mathrm{Int}(x)$.

The equality in the above formula has the following interpretation. Denote by $E_\xi$ a random trial producing a random variable $\xi$ with the given probability function $p(k)$, and denote by $E_z$ an independent random trial producing the random variable $z$ uniformly distributed on $[0, 1]$. In the compound random trial $E = E_\xi \times E_z$ the random variable $\tilde{\xi} = \xi + z$ has the density function $f(u)$, and the random variable $\mathrm{Int}(\tilde{\xi})$ takes the value $k$ if and only if $\xi = k$ as a result of the trial $E_\xi$. Therefore we can consider that the compound random trial $E$ produces the random values $\xi$ and $\tilde{\xi}$ such that not only the distributions of $\xi$ and $\mathrm{Int}(\tilde{\xi})$ are the same, but the values $\xi$ and $\mathrm{Int}(\tilde{\xi})$ are the same as well.

Let $\xi$ be a random variable with binomial distribution and $\tilde{\xi}$ be the continuous random variable inducing $\xi$. In the Bernoulli model the mathematical expectation $m(\tilde{\xi})$ and the variance $\sigma^2(\tilde{\xi})$ are as follows:

$$m(\tilde{\xi}) = \int_{-\infty}^{\infty} u f(u)\,du = \int_{0}^{n+1} u f(u)\,du = \sum_{k=0}^{n} p_k \int_{k}^{k+1} u\,du = \sum_{k=0}^{n} p_k\,\frac{(k+1)^2 - k^2}{2} = \sum_{k=0}^{n} k\,p(k) + \frac{1}{2} \sum_{k=0}^{n} p(k) = np + \frac{1}{2};$$

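$$m(\tilde{\xi}^2) = \sum_{k=0}^{n} p_k \int_{k}^{k+1} u^2\,du = \frac{1}{3} \sum_{k=0}^{n} p_k \left( (k+1)^3 - k^3 \right) = \sum_{k=0}^{n} k^2 p_k + \sum_{k=0}^{n} k\,p_k + \frac{1}{3} = m(\xi^2) + m(\xi) + \frac{1}{3} = \left( \sigma^2(\xi) + (m(\xi))^2 \right) + np + \frac{1}{3} = npq + (np)^2 + np + \frac{1}{3},$$

$$\sigma^2(\tilde{\xi}) = m(\tilde{\xi}^2) - \left( m(\tilde{\xi}) \right)^2 = npq + (np)^2 + np + \frac{1}{3} - \left( np + \frac{1}{2} \right)^2 = npq + \frac{1}{12},$$

where $q = 1 - p$.

In the generalized Bernoulli model we have

$$m(\kappa) = m p_q, \qquad \sigma^2(\kappa) = \frac{m(m + n + 1)}{n + 2}\, p_q (1 - p_q).$$

Therefore, for the continuous random variable $\tilde{\kappa}$ inducing $\kappa$ in the generalized Bernoulli model,

$$m(\tilde{\kappa}) = m p_q + \frac{1}{2}, \qquad \sigma^2(\tilde{\kappa}) = \frac{m(m + n + 1)}{n + 2}\, p_q (1 - p_q) + \frac{1}{12}.$$

Consider an arbitrary fixed confidence interval $[a, b]$ containing the bulk of the distribution of $\tilde{\xi}$ with significance level $\alpha$, that is, $P(\tilde{\xi} \notin [a, b]) \le \alpha$. Since $\mathrm{Int}(x)$ is a non-decreasing function, the random event $\tilde{A} = \{\tilde{\xi} \in [a, b]\} = \{a \le \tilde{\xi} \le b\}$ implies the random event $A = \{\mathrm{Int}(a) \le \xi = \mathrm{Int}(\tilde{\xi}) \le \mathrm{Int}(b)\}$. Therefore the significance level of the closed confidence interval $[\mathrm{Int}(a), \mathrm{Int}(b)]$ for the bulk of the distribution of $\xi$ does not exceed $\alpha$. Moreover, $[\mathrm{Int}(a), \mathrm{Int}(b)] \subset [a - 1, b]$, and hence $P(\xi \in [\mathrm{Int}(a), \mathrm{Int}(b)]) \le P(\xi \in [a - 1, b])$. Therefore the significance level of the confidence interval $[a - 1, b]$ for the bulk of the distribution of $\xi$ also does not exceed $\alpha$:

$$P\{\xi \notin [a - 1, b]\} \le \alpha.$$

It is easy to see that the integer discrete random variable $\xi$ is unimodal if and only if the inducing continuous random variable $\tilde{\xi}$ is unimodal in the sense of Khinchin. For such random variables the Gauss-Vysochanskij-Petunin inequality holds [8]:

$$P\left( |\tilde{\xi} - m| \ge \lambda \sigma \right) \le \frac{4}{9\lambda^2}, \qquad \lambda > \sqrt{\tfrac{8}{3}},$$

where $m = m(\tilde{\xi})$ and $\sigma = \sigma(\tilde{\xi})$. Therefore the significance level of the confidence interval $[m - \lambda\sigma, m + \lambda\sigma]$ covering the bulk of the distribution of $\tilde{\xi}$ does not exceed $\alpha = \frac{4}{9\lambda^2}$. In particular, when $\lambda = 3$ we have $\alpha = \frac{4}{81} < 0.05$.

In the case of the classical Bernoulli model put

$$a = m - \lambda\sigma = np + \frac{1}{2} - \lambda \sqrt{npq + \frac{1}{12}}, \qquad b = m + \lambda\sigma = np + \frac{1}{2} + \lambda \sqrt{npq + \frac{1}{12}}.$$

On the basis of the previous reasoning, the confidence interval $[a - 1, b]$ covers the bulk of the random variable $\xi$ with binomial distribution, i.e. the confidence interval

$$I = \left[ np - \frac{1}{2} - \lambda \sqrt{npq + \frac{1}{12}},\ \ np + \frac{1}{2} + \lambda \sqrt{npq + \frac{1}{12}} \right]$$

has a significance level which does not exceed $\alpha = \frac{4}{9\lambda^2}$ when $\lambda > \sqrt{8/3}$. The random event $\{\xi \in I\}$ can be rewritten in the following form:

$$|\xi - np| \le \lambda \sqrt{npq + \frac{1}{12}} + \frac{1}{2}.$$

Thus, in the Bernoulli model,

$$P\left( |h - p| \le \lambda \sqrt{\frac{pq}{n} + \frac{1}{12 n^2}} + \frac{1}{2n} \right) \ge 1 - \alpha,$$

where $h = \xi / n$ is the proportion of successes.

To construct the confidence interval for the unknown probability $p$ on the basis of the proportion $h$ in the Bernoulli model consisting of $n$ trials, consider two functions depending on $p \in [0, 1]$:

$$\varphi(p) = |h - p| \quad \text{and} \quad \psi(p) = \lambda \sqrt{\frac{p(1 - p)}{n} + \frac{1}{12 n^2}} + \frac{1}{2n}.$$

Let $\psi_0(p) = \sqrt{\frac{p(1 - p)}{n} + \frac{1}{12 n^2}}$, $p \in \mathbb{R}$. It is easy to see that the graph of the function $\psi_0(p)$ is the upper half of the ellipse $E$ given by $\left( p - \frac{1}{2} \right)^2 + n y^2 = \frac{1}{4} + \frac{1}{12n}$, which passes through the points

$$A = \left( \tfrac{1}{2} - \tfrac{1}{2}\sqrt{1 + \tfrac{1}{3n}},\ 0 \right), \quad B = \left( \tfrac{1}{2},\ \sqrt{\tfrac{1}{4n} + \tfrac{1}{12 n^2}} \right), \quad C = \left( \tfrac{1}{2} + \tfrac{1}{2}\sqrt{1 + \tfrac{1}{3n}},\ 0 \right), \quad D = \left( \tfrac{1}{2},\ -\sqrt{\tfrac{1}{4n} + \tfrac{1}{12 n^2}} \right)$$

and has its center at the point $\left( \frac{1}{2}, 0 \right)$. The graph of $\psi(p)$ is constructed from the restriction of the graph of $\psi_0(p)$ to the segment $[0, 1]$ by stretching or compressing it by the factor $\lambda$ and shifting it by $\frac{1}{2n}$. Therefore the graph of the function $\psi(p)$, which does not depend on $h$, is an arc of an ellipse $\Gamma_\psi$ passing through the points $(0, \psi(0))$ and $(1, \psi(1))$, such that the function $\psi(p)$ achieves its maximum at the point $p = \frac{1}{2}$ and is symmetrical with respect to that point.

The lower confidence limit $p_1$ is a root of the quadratic equation

$$\left( 1 + \frac{\lambda^2}{n} \right) p^2 - \left( \frac{\lambda^2}{n} + 2h - \frac{1}{n} \right) p + h^2 - \frac{h}{n} + \frac{1}{4 n^2} - \frac{\lambda^2}{12 n^2} = 0. \tag{3}$$

If $h > \psi(0) = \frac{\lambda}{\sqrt{12}\, n} + \frac{1}{2n}$, then the lower confidence limit $p_1$ is the least root of (3). If $h \le \psi(0)$, then $p_1 = 0$. Similarly, the upper confidence limit $p_2$ is a root of the equation

$$\left( 1 + \frac{\lambda^2}{n} \right) p^2 - \left( \frac{\lambda^2}{n} + 2h + \frac{1}{n} \right) p + h^2 + \frac{h}{n} + \frac{1}{4 n^2} - \frac{\lambda^2}{12 n^2} = 0. \tag{4}$$

If $1 - h > \psi(1)$, then the upper confidence limit $p_2$ is the largest root of (4). If $1 - h \le \psi(1)$, then $p_2 = 1$.

Remark 2. Note that $p_1 \le h \le p_2$, so that the proportion of successes always lies in the confidence interval $[p_1, p_2]$.

For the generalized Bernoulli model, with $h = \kappa / m$, similar reasoning gives the following quadratic equation for the lower confidence limit:

$$\left( 1 + \frac{(m + n + 1)\lambda^2}{m(n + 2)} \right) p^2 - \left( \frac{(m + n + 1)\lambda^2}{m(n + 2)} + 2h - \frac{1}{m} \right) p + h^2 - \frac{h}{m} + \frac{1}{4 m^2} - \frac{\lambda^2}{12 m^2} = 0. \tag{5}$$

If $h > \frac{\lambda}{\sqrt{12}\, m} + \frac{1}{2m} = \gamma$, then the lower confidence limit $p_1$ for the generalized Bernoulli model is the least root of (5). If $h \le \gamma$, then $p_1 = 0$. Similarly, the upper confidence limit $p_2$ for the generalized Bernoulli model is a root of the quadratic equation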

$$\left( 1 + \frac{(m + n + 1)\lambda^2}{m(n + 2)} \right) p^2 - \left( \frac{(m + n + 1)\lambda^2}{m(n + 2)} + 2h + \frac{1}{m} \right) p + h^2 + \frac{h}{m} + \frac{1}{4 m^2} - \frac{\lambda^2}{12 m^2} = 0. \tag{6}$$

If $1 - h > \gamma$, then the upper confidence limit $p_2$ is the largest root of (6). If $1 - h \le \gamma$, then $p_2 = 1$. By virtue of the previous results, the significance level of the confidence interval does not exceed $\frac{4}{9\lambda^2}$ (in particular, 0.05 for $\lambda = 3$).

References

[1] Petunin YuI, Klyushin DA, Andrushkiw RI, Ganina KP, Boroday NV. Computer-Aided Differential Diagnosis of Breast Cancer and Fibroadenomatosis based on Malignancy Associated Changes in Buccal Epithelium. Automedica ; 9(-4): 5-64.
[2] Brown LD, Cai TT, DasGupta A. Interval Estimation for a Binomial Proportion. Stat Sci 2001; 16(2): 101-133.
[3] Petunin YuI, Klyushin DA, Andrushkiw RI, Ganina KP, Boroday NV. Analysis of Malignancy-Associated DNA Changes in the Nuclei of Buccal Epithelium in the Pathology of the Thyroid and Mammary Glands. Annals of the New York Academy of Sciences ; 98: -.
[4] Yoo S, David H. Revisiting Clopper-Pearson. Technical Report -5, Department of Statistics and Statistical Laboratory, Iowa University.
[5] Andrushkiw RI, Klyushin DA, Petunin YuI, Lysyuk V, Boroday NV. Diagnosis of Breast Cancer by the Modified Nearest Neighbor Recognition Method. In: F. Valafar, editor. Proceedings of the International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences; Jun 4-7; Las Vegas, Nevada, USA; p 76-89.
[6] Matveychuk SA, Petunin YuI. Generalized Bernoulli schemes in variance statistics. Part I. Ukr Mat Journal 99; 4(4): 58-58.
[7] Matveychuk SA, Petunin YuI. Generalized Bernoulli schemes in variance statistics. Part II. Ukr Mat Journal 99; 4(6): 779-785.
[8] Vysochanskij DF, Petunin YuI. Justification of the 3σ rule for unimodal distributions. Theor Probab and Math Stat 989; : 5-6.
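
As an illustration of how the confidence limits defined by equations (3)-(6) might be computed, here is a minimal Python sketch. It is not the authors' code. The function names (`confidence_limits`, `bernoulli_limits`, `generalized_limits`, `reject_homogeneity`) are hypothetical, and the sketch assumes that both models share the boundary equation $|h - p| = \lambda \sqrt{c\,p(1 - p) + \frac{1}{12 N^2}} + \frac{1}{2N}$, with $N = n$, $c = 1/n$ in the classical model and $N = m$, $c = \frac{m + n + 1}{m(n + 2)}$ in the generalized one.

```python
import math


def confidence_limits(h, trials, c, lam=3.0):
    """Limits (p1, p2) from |h - p| = lam*sqrt(c*p*(1-p) + 1/(12*N^2)) + 1/(2*N).

    h      -- observed proportion of successes
    trials -- number of trials N (n in the classical model, m in the generalized)
    c      -- variance coefficient (assumed form, see the lead-in)
    lam    -- multiplier lambda; lam = 3 gives a level of at most 4/81
    """
    # gamma = psi(0) = psi(1): threshold below which a limit sticks to 0 or 1
    gamma = lam / (math.sqrt(12.0) * trials) + 1.0 / (2.0 * trials)
    a = 1.0 + c * lam ** 2
    tail = 1.0 / (4.0 * trials ** 2) - lam ** 2 / (12.0 * trials ** 2)

    def root(sign):
        # sign = -1: equation for the lower limit, take the least root
        # sign = +1: equation for the upper limit, take the largest root
        b = c * lam ** 2 + 2.0 * h + sign / trials
        const = h ** 2 + sign * h / trials + tail
        disc = math.sqrt(b * b - 4.0 * a * const)
        return (b + sign * disc) / (2.0 * a)

    lower = root(-1) if h > gamma else 0.0
    upper = root(+1) if 1.0 - h > gamma else 1.0
    return lower, upper


def bernoulli_limits(h, n, lam=3.0):
    """Classical Bernoulli model with n trials, equations (3) and (4)."""
    return confidence_limits(h, n, 1.0 / n, lam)


def generalized_limits(h, m, n, lam=3.0):
    """Generalized Bernoulli model with sample sizes n and m, equations (5) and (6)."""
    c = (m + n + 1) / (m * (n + 2))
    return confidence_limits(h, m, c, lam)


def reject_homogeneity(kappa, m, n, q, lam=3.0):
    """Step 4 of the test: reject H0 if the interval misses p_q = q / (n + 1)."""
    lower, upper = generalized_limits(kappa / m, m, n, lam)
    p_q = q / (n + 1)
    return not (lower <= p_q <= upper)
```

For example, `bernoulli_limits(0.3, 100)` returns the pair of limits for 30 successes in 100 trials with $\lambda = 3$, and `reject_homogeneity(kappa, m, n, q)` applies the four-step test described in the setting. Sharing one solver between the two models is a design choice of this sketch; it relies only on the fact that equations (3)-(6) differ in the coefficient of $p(1 - p)$ and in the number of trials.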