Metrika, Volume 28, 1981, pages 257-262. Vienna.

Estimation Problems for Rectangular Distributions (Or the Taxi Problem Revisited)

By J.S. Rao, Santa Barbara 1)

Abstract: The problem of estimating the unknown upper bound $\theta$ on the basis of a sample of size $n$ from a uniform or rectangular distribution on $[0, \theta]$ has considerable interest. This or the analogous discrete version is variously known as the "Taxi problem" or the "German bomb (or Tank) problem" and has a long history. The emphasis here is on estimation of $\theta$ through the lengths of the observed gaps or spacings, which seem natural for this problem.

1. Introduction

Let $X_1, \ldots, X_n$ be a random sample from a uniform distribution on $[0, \theta]$. Estimation of the unknown upper bound $\theta$ is of interest, for instance, in connection with estimating the total number of taxis in a town on the basis of observed registration numbers, or in estimating the number of enemy bombs (or tanks) on the basis of observed serial numbers, provided, of course, that some obvious assumptions hold. See, for instance, Noether [1971, 2-5] for an elementary discussion. A continuous uniform distribution will be assumed here, which provides a good approximation to the results in the case of a discrete uniform on the integers $\{1, 2, \ldots, \theta\}$. In fact, analogous results may be obtained for the latter case.

When $(X_1, \ldots, X_n)$ is a random sample from $R(0, \theta)$, the rectangular (or uniform) distribution on $(0, \theta)$, the following results are known and stated for completeness. Let

\[ 0 < X_{1n} < X_{2n} < \cdots < X_{nn} < \theta \tag{1.1} \]

denote the order statistics. The sample maximum $X_{nn}$ is a complete sufficient statistic and has the cumulative distribution function (cdf)

\[ F_{X_{nn}}(x) = (x/\theta)^n, \quad 0 < x < \theta. \tag{1.2} \]

From (1.2), it is seen that $E_\theta(X_{nn}) = \frac{n}{n+1}\,\theta$, which suggests the estimator

\[ T_n = \frac{n+1}{n}\, X_{nn}. \tag{1.3} \]

1) J.S. Rao, Department of Mathematics, University of California, Santa Barbara, California 93106, USA.

The estimator $T_n$ in (1.3) is unbiased for $\theta$. Since this estimator is a function of the complete sufficient statistic, it follows from the Rao-Blackwell and Lehmann-Scheffé theorems that $T_n$ is the (essentially) unique uniformly minimum variance unbiased estimate (umvue) of $\theta$ [see, for instance, David, 1970, p. 96]. Sample spacings or observed gaps come naturally into play in this problem since $X_{nn}$ falls short of $\theta$ by an amount equal to the last gap.

Now we introduce some basic facts about spacings. Spacings are defined to be the gaps between successive observations, i.e.

\[ D_{in} = X_{in} - X_{i-1,n}, \quad i = 1, 2, \ldots, n, \tag{1.4} \]

where we put $X_{0n} \equiv 0$. Since $n$ is held fixed in all subsequent discussions, we shall drop the second subscript $n$ in $X_{in}$, $D_{in}$, etc. to simplify the notation. If one defines

\[ U_i = X_i/\theta \quad \text{and} \quad T_i = D_i/\theta, \quad i = 1, \ldots, n, \tag{1.5} \]

then $(U_1, \ldots, U_n)$ has the same distribution as the order statistics of a random sample from an $R(0, 1)$ distribution, while $(T_1, \ldots, T_n)$ correspond to the "uniform spacings." These $\{T_i\}$ form an exchangeable set of random variables with a joint Dirichlet distribution. Recall that a $k$-dimensional random vector $(Y_1, \ldots, Y_k)$ has a Dirichlet distribution, denoted by $D(r_1, \ldots, r_k; r_{k+1})$, if it has the joint density

\[ f(y_1, \ldots, y_k) = \frac{\Gamma(r_1 + \cdots + r_{k+1})}{\prod_{i=1}^{k+1} \Gamma(r_i)} \Bigl( \prod_{i=1}^{k} y_i^{r_i - 1} \Bigr) (1 - y_1 - \cdots - y_k)^{r_{k+1} - 1} \tag{1.6} \]

over the simplex $S_k = \{ y : y_i \ge 0, \ \sum_{i=1}^{k} y_i \le 1 \}$ in $R^k$. See Wilks [1962, 177-182] for an excellent discussion of the basic facts about this distribution. In particular, $(T_1, \ldots, T_n)$ has an $n$-variate Dirichlet $D(1, \ldots, 1; 1)$ with all the parameter values unity, i.e., with density

\[ f(t_1, \ldots, t_n) = n! \tag{1.7} \]

over the simplex $S_n = \{ t : t_i \ge 0, \ \sum_{i=1}^{n} t_i \le 1 \}$ in $R^n$ [see, for instance, David, 1970, 79-80]. From (1.7) it follows that any $T_i$ has a $D(1; n)$ or Beta$(1, n)$ distribution. From this and the fact that $D_i$ and $\theta T_i$ have the same distribution, it can be verified that

\[
\begin{aligned}
E(D_i) &= \theta/(n+1), \\
V(D_i) &= n\theta^2 / \bigl( (n+1)^2 (n+2) \bigr), \\
\mathrm{Cov}(D_i, D_j) &= -\theta^2 / \bigl( (n+1)^2 (n+2) \bigr) \quad \text{for } i \ne j.
\end{aligned}
\tag{1.8}
\]

It may be noted in passing that Dirichlet random variables also have an additive property.
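The moments in (1.8) can be checked numerically. The following is a minimal Monte Carlo sketch in Python using only the standard library. The function names, the sample size, the seed and the number of replications are illustrative assumptions and do not come from the paper.

```python
import random


def spacings(n, theta, rng):
    """Return the spacings D_1..D_n of a uniform(0, theta) sample, with X_0 = 0."""
    xs = sorted(rng.uniform(0.0, theta) for _ in range(n))
    gaps = []
    prev = 0.0
    for x in xs:
        gaps.append(x - prev)
        prev = x
    return gaps


def simulate_moments(n=5, theta=10.0, reps=100_000, seed=1):
    """Monte Carlo estimates of E(D_1), V(D_1) and Cov(D_1, D_2); needs n >= 2."""
    rng = random.Random(seed)
    s1 = s2 = s11 = s12 = 0.0
    for _ in range(reps):
        d = spacings(n, theta, rng)
        s1 += d[0]
        s2 += d[1]
        s11 += d[0] * d[0]
        s12 += d[0] * d[1]
    m1 = s1 / reps
    m2 = s2 / reps
    return m1, s11 / reps - m1 * m1, s12 / reps - m1 * m2


def theory(n, theta):
    """Closed forms from (1.8): mean, variance and covariance of the spacings."""
    c = theta ** 2 / ((n + 1) ** 2 * (n + 2))
    return theta / (n + 1), n * c, -c


if __name__ == "__main__":
    print("simulated:", simulate_moments())
    print("theory:   ", theory(5, 10.0))
```

The simulated values should agree with the closed forms up to Monte Carlo error. The negative covariance reflects the constraint that the spacings and the final gap up to $\theta$ must sum to $\theta$.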

The additive property of Dirichlet random variables is that, for $l \le k$, the sum $\sum_{i=1}^{l} Y_i$ has a $D(r_1 + \cdots + r_l; \ r_{l+1} + \cdots + r_{k+1})$ distribution, which is a Beta distribution. From this, the sampling distributions of the uniform order statistics $U_r = \sum_{i=1}^{r} T_i$ and the sample range $U_n - U_1 = \sum_{i=2}^{n} T_i$ can be written down immediately as the Beta$(r; n+1-r)$ and Beta$(n-1; 2)$ respectively.

2. Estimation of $\theta$

Estimation of parameters through the use of a few or all of the order statistics has several advantages, principally simplicity. See, for instance, Mosteller [1946] or David [1970, chapters 6 and 7]. Such estimators are especially useful in situations where trimming and censoring of the observations is part of the model, and they yield a drastic reduction in labor over the optimal methods, which can sometimes be laborious. We suggest here estimation through spacings, linear combinations of which are equivalent to linear functions of order statistics. For a discussion of linear estimation through order statistics, refer to David [1970, p. 102].

As pointed out earlier, the sample maximum falls short of $\theta$ by an amount equal to the last gap. Since the gaps are exchangeable, adding the length of any of the gaps, or the average length of any set of gaps, or merely multiplying any gap by $(n+1)$ yields unbiased estimators of $\theta$. Thus, for $r = 1, \ldots, n$,

\[
\begin{aligned}
T_{1r} &= X_n + D_r = 2D_r + \sum_{i \ne r} D_i, \\
T_{2r} &= X_n + \frac{1}{r} \sum_{i=1}^{r} D_i = \Bigl( 1 + \frac{1}{r} \Bigr) \sum_{i=1}^{r} D_i + \sum_{i=r+1}^{n} D_i, \\
T_{3r} &= (n+1) D_r
\end{aligned}
\tag{2.1}
\]

are all unbiased estimators of $\theta$. From (1.8), one can verify

\[
\begin{aligned}
\mathrm{Var}(T_{1r}) &= 2\theta^2 / \bigl( (n+1)(n+2) \bigr), \\
\mathrm{Var}(T_{2r}) &= \theta^2 \Bigl( 1 + \frac{1}{r} \Bigr) / \bigl( (n+1)(n+2) \bigr), \\
\mathrm{Var}(T_{3r}) &= n\theta^2 / (n+2).
\end{aligned}
\tag{2.2}
\]

Because of symmetry, the variance expressions for $\{T_{1r}\}$ and $\{T_{3r}\}$ do not depend on the specific $D_r$ that is used, while $V(T_{2r})$ decreases with $r$ and is a minimum for $r = n$, for which

\[ T_{2n} = X_n + \frac{X_n}{n} = \Bigl( 1 + \frac{1}{n} \Bigr) X_n = T_n. \tag{2.3} \]
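The variances of the estimators in (2.1) follow term by term from the covariance structure (1.8), since each estimator is a linear combination of spacings. A minimal sketch in exact rational arithmetic is given below. The helper names are hypothetical, and all results are expressed in units of $\theta^2$.

```python
from fractions import Fraction


def var_linear(b):
    """Var(sum_i b_i D_i) / theta^2, summed term by term from (1.8)."""
    n = len(b)
    c = Fraction(1, (n + 1) ** 2 * (n + 2))
    total = Fraction(0)
    for i in range(n):
        for j in range(n):
            cov = n * c if i == j else -c  # V(D_i) on the diagonal, Cov(D_i, D_j) off it
            total += Fraction(b[i]) * Fraction(b[j]) * cov
    return total


def weights_t1(n, r):
    """Weights of T_1r = 2 D_r + sum_{i != r} D_i."""
    b = [Fraction(1)] * n
    b[r - 1] = Fraction(2)
    return b


def weights_t2(n, r):
    """Weights of T_2r = (1 + 1/r) sum_{i <= r} D_i + sum_{i > r} D_i."""
    return [1 + Fraction(1, r)] * r + [Fraction(1)] * (n - r)


def weights_t3(n, r):
    """Weights of T_3r = (n + 1) D_r."""
    b = [Fraction(0)] * n
    b[r - 1] = Fraction(n + 1)
    return b


if __name__ == "__main__":
    n, r = 4, 2
    for name, w in (("T1r", weights_t1(n, r)),
                    ("T2r", weights_t2(n, r)),
                    ("T3r", weights_t3(n, r))):
        # Each weight vector sums to n + 1, which is the unbiasedness condition.
        print(name, "sum of weights =", sum(w), " Var/theta^2 =", var_linear(w))
```

For $n = 4$ and $r = 2$ this prints the relative variances 1/15, 1/20 and 2/3. Working with `Fraction` keeps the comparison with the closed forms exact rather than approximate.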

This is the estimator $T_n$ defined in (1.3). Also recall that if $\bar{X} = \sum_{i=1}^{n} X_i / n$ denotes the sample mean, then

\[ 2\bar{X} = \frac{2}{n} \sum_{i=1}^{n} (n - i + 1) D_i \tag{2.4} \]

provides yet another unbiased estimator of $\theta$. Thus one may consider a linear combination of spacings

\[ W = \sum_{i=1}^{n} b_i D_i \tag{2.5} \]

to estimate $\theta$. Unbiasedness of $W$ implies the condition

\[ \sum_{i=1}^{n} b_i = (n+1), \tag{2.6} \]

which is, of course, the case with all the estimators in (2.1) and (2.4). It is now natural to ask for the best linear unbiased estimate of $\theta$ from among the class (2.5). An elementary calculation using (1.8) shows that the variance of $W$ is minimized subject to (2.6) when $b_i = (n+1)/n$ for $i = 1, \ldots, n$, with the resulting estimator (2.3). The equal weights on all the spacings are to be expected from symmetry considerations. Since (2.3) is the umvue and is also of the form (2.5), it is no surprise that it is the best linear unbiased estimate.

Indeed, equation (1.8) shows that the vector $D = (D_1, \ldots, D_n)'$ follows a linear model with expectation $(n+1)^{-1} \theta \mathbf{1}$, where $\mathbf{1}$ is the column vector with all ones, and covariance matrix $Q = [(n+1) I - \mathbf{1}\mathbf{1}'] \, \theta^2 / \bigl( (n+1)^2 (n+2) \bigr)$. Using the fact that the inverse of $[(n+1) I - \mathbf{1}\mathbf{1}']$ is $(n+1)^{-1} [I + \mathbf{1}\mathbf{1}']$, the formal Gauss-Markov least squares estimator (in its slightly generalized version, since the covariance matrix is not diagonal) is given by

\[ \hat{\theta} = \bigl[ (n+1)^{-2} \, \mathbf{1}' Q^{-1} \mathbf{1} \bigr]^{-1} \bigl[ (n+1)^{-1} \, \mathbf{1}' Q^{-1} D \bigr], \tag{2.7} \]

which is again the statistic in (2.3) with equal weights $b_i = (n+1)/n$.

Alternatively, one can approach the problem of estimating $\theta$ with the goal of minimizing the mean square error (MSE), where $\mathrm{MSE}(\hat{\theta}) = E(\hat{\theta} - \theta)^2$, and relax the condition (2.6) that the estimator $\hat{\theta}$ be unbiased. If we use equal weights, say $b$, on all $\{D_i\}$, then the problem is to find the weight $b$ for which the estimator $\sum_{i=1}^{n} b D_i = b X_n$ has the smallest MSE. It is easy to verify that

\[ \mathrm{MSE}(b X_n) = E(b X_n - \theta)^2 = \theta^2 \Bigl[ \frac{n b^2}{n+2} - \frac{2 n b}{n+1} + 1 \Bigr]. \tag{2.8} \]
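The generalized least squares computation in (2.7) can be traced in a few lines. The sketch below, again in exact rational arithmetic, checks the stated inverse of $(n+1)I - \mathbf{1}\mathbf{1}'$ and then reads off the weights that (2.7) places on the spacings. The function name is a hypothetical choice, and the scale factor $\theta^2/((n+1)^2(n+2))$ of $Q$ is dropped because it cancels in (2.7).

```python
from fractions import Fraction


def gls_weights(n):
    """Weights b_i such that the estimator in (2.7) equals sum_i b_i D_i."""
    ident = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    # A = (n+1) I - 11' is the covariance matrix Q up to a positive scalar.
    a = [[(n + 1) * ident[i][j] - 1 for j in range(n)] for i in range(n)]
    # Claimed inverse: (n+1)^{-1} (I + 11').
    a_inv = [[(ident[i][j] + 1) / (n + 1) for j in range(n)] for i in range(n)]
    # Verify A * A^{-1} = I before using it.
    for i in range(n):
        for j in range(n):
            entry = sum(a[i][k] * a_inv[k][j] for k in range(n))
            if entry != ident[i][j]:
                raise ValueError("claimed inverse is wrong")
    row = [sum(a_inv[i][j] for i in range(n)) for j in range(n)]  # 1' A^{-1}
    denom = sum(row)                                              # 1' A^{-1} 1
    # (2.7) simplifies to (n+1) * (1' A^{-1} D) / (1' A^{-1} 1).
    return [(n + 1) * w / denom for w in row]


if __name__ == "__main__":
    print(gls_weights(5))  # every weight equals Fraction(6, 5)
```

Every weight comes out as $(n+1)/n$, so the weighted sum of the spacings is $((n+1)/n) X_n$, in agreement with (2.3).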

The MSE in (2.8) is minimized when $b = (n+2)/(n+1)$. Thus the estimator

\[ T_{4n} = \Bigl( \frac{n+2}{n+1} \Bigr) X_n \tag{2.9} \]

has the smallest MSE. It is interesting to compare this with the other competitors, namely the umvue $T_{2n}$ in (2.3) and the maximum likelihood estimator $X_n$. Taking $b$ to be $(n+2)/(n+1)$, $(n+1)/n$ and $1$ respectively in (2.8), we get

\[
\begin{aligned}
\mathrm{MSE}(T_{4n}) &= \theta^2 / (n+1)^2, \\
\mathrm{MSE}(T_{2n}) &= \theta^2 / \bigl( n(n+2) \bigr), \\
\mathrm{MSE}(X_n) &= 2\theta^2 / \bigl( (n+1)(n+2) \bigr),
\end{aligned}
\tag{2.10}
\]

from which it follows that, with respect to the MSE criterion, $T_{4n}$ given in equation (2.9) is uniformly better than the umvue $T_{2n}$ given in (2.3), which in turn is uniformly better than the maximum likelihood estimator $X_n$. This incidentally is another instance of a situation where the umvue is not admissible under the quadratic loss function.

Another interesting way to improve the estimators given in (2.1) with respect to their MSE's is given by the following procedure. Since the squared coefficient of variation $v$ (i.e., $\mathrm{Var}(\hat{\theta})/\theta^2$) is independent of $\theta$ (cf. equation (2.2)), $\theta^* = (1+v)^{-1} \hat{\theta}$ yields another estimator of $\theta$ with

\[ \mathrm{MSE}(\theta^*) = \mathrm{Var}(\theta^*) + [\mathrm{Bias}(\theta^*)]^2 = \theta^2 \frac{v}{(1+v)^2} + \theta^2 \frac{v^2}{(1+v)^2} = \theta^2 \frac{v}{1+v}, \]

which is uniformly smaller than the MSE $v\theta^2$ of the original estimator $\hat{\theta}$. Thus each of the unbiased estimators in (2.1) may be improved with respect to the MSE. This yields the estimators

\[
\begin{aligned}
T^*_{1r} &= \frac{(n+1)(n+2)}{(n+1)(n+2) + 2} \, (X_n + D_r), \\
T^*_{2r} &= \frac{r(n+1)(n+2)}{r(n+1)(n+2) + (r+1)} \, T_{2r}, \\
T^*_{3r} &= \frac{n+2}{2} \, D_r,
\end{aligned}
\tag{2.11}
\]

which have smaller MSE's than the corresponding unbiased estimators given in (2.1). While the MSE of $T^*_{1r}$ and $T^*_{3r}$ does not depend on $r$, the MSE of $T^*_{2r}$ does depend on $r$ and is a minimum for $r = n$. It is very interesting to note that the resulting $T^*_{2n}$ is indeed $((n+2)/(n+1)) X_n$, the estimator with minimum MSE that we obtained in (2.9).
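The comparison in (2.10) and the shrinkage step $\theta^* = (1+v)^{-1}\hat{\theta}$ can be verified with a short exact computation. The sketch below evaluates the bracket in (2.8) for the three choices of $b$. The function names and the choice $n = 10$ are illustrative assumptions.

```python
from fractions import Fraction


def mse_factor(b, n):
    """MSE(b * X_n) / theta^2, i.e. the bracket in (2.8)."""
    b = Fraction(b)
    return Fraction(n, n + 2) * b * b - Fraction(2 * n, n + 1) * b + 1


def shrink(v):
    """Relative MSE v / (1 + v) of (1 + v)^{-1} times an unbiased estimator
    whose variance is v * theta^2."""
    v = Fraction(v)
    return v / (1 + v)


if __name__ == "__main__":
    n = 10
    mse_t4 = mse_factor(Fraction(n + 2, n + 1), n)  # minimum-MSE multiple of X_n
    mse_t2 = mse_factor(Fraction(n + 1, n), n)      # umvue
    mse_mle = mse_factor(1, n)                      # maximum likelihood estimator
    print(mse_t4, mse_t2, mse_mle)  # 1/121 1/120 1/66
    # Shrinking the umvue by (1 + v)^{-1} recovers the minimum-MSE estimator.
    print(shrink(mse_t2) == mse_t4)  # True
```

For $n = 10$ the relative MSE's are 1/121, 1/120 and 1/66, so the ordering in (2.10) holds, and shrinking the umvue reproduces the MSE of the estimator in (2.9).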

But the real advantage of using spacings in estimation of $\theta$ comes in situations of censoring, where some of the order statistics at either end or in the middle are missing. The best linear unbiased estimate based on the spacings would then be to put equal weights on the available or observed gaps. In particular, if the sample is censored so that one observes the order statistics only up to the $m$-th, $X_m$ (for $m \le n$), then the following are all unbiased estimators of $\theta$:

\[
\begin{aligned}
\tilde{T}_{1r} &= X_m + (n + 1 - m) D_r, \\
\tilde{T}_{2r} &= X_m + \frac{n + 1 - m}{r} \sum_{i=1}^{r} D_i, \\
\tilde{T}_{3r} &= (n+1) D_r
\end{aligned}
\tag{2.12}
\]

for $r = 1, \ldots, m$. By an analysis similar to that used before, it may be shown that the best linear unbiased estimate of $\theta$ is

\[ \tilde{T}_{2m} = \sum_{i=1}^{m} \Bigl( \frac{n+1}{m} \Bigr) D_i = \Bigl( \frac{n+1}{m} \Bigr) X_m \tag{2.13} \]

with variance

\[ V(\tilde{T}_{2m}) = (n - m + 1) \theta^2 / \bigl( (n+2) m \bigr). \tag{2.14} \]

Thus spacings seem to be the natural quantities to consider in the estimation of $\theta$. Sarhan/Greenberg [1959] discuss the problem of censoring at both ends in rectangular populations using order statistics. This alternate approach based on spacings yields the same results, more effortlessly.

The author is very grateful to the referee for his many helpful comments and suggestions.

References

David, H.A.: Order Statistics. New York 1970.
Mosteller, F.: On some useful "inefficient" statistics. Ann. Math. Statist. 17, 1946, 377-408.
Noether, G.: Introduction to statistics - a fresh approach. Boston 1971.
Sarhan, A.E., and G.B. Greenberg: Estimation of location and scale parameters for the rectangular populations from censored samples. J.R. Statist. Soc. B21, 1959, 356-363.
Wilks, S.S.: Mathematical Statistics. New York 1962.

Received April 30, 1979 (revised version December 1979)