A NEW METHOD FOR CONSTRUCTING APPROXIMATE CONFIDENCE INTERVALS FOR M-ESTIMATES. Dennis D. Boos


A NEW METHOD FOR CONSTRUCTING APPROXIMATE CONFIDENCE INTERVALS FOR M-ESTIMATES

by Dennis D. Boos

Department of Statistics, North Carolina State University

Institute of Statistics Mimeo Series #1198, September 1978

The empirical function used to define M-estimates of location is similar to a distribution function when $\psi$ is nondecreasing. This similarity allows approximate confidence intervals to be constructed from the "percentiles" of the defining function.

KEY WORDS: M-estimates; Confidence intervals; Quantiles; t statistic.

1. INTRODUCTION

Let $X_1, \ldots, X_n$ be a sample from a distribution $F$ and define the location "parameter" $\theta$ to be the solution of

$$\int_{-\infty}^{\infty} \psi(x-\theta)\,dF(x) = 0. \tag{1.1}$$

An M-estimate for $\theta$ is the solution $\hat\theta$ of the empirical analogue to (1.1),

$$\frac{1}{n}\sum_{i=1}^{n} \psi(X_i-\hat\theta) = 0. \tag{1.2}$$

Asymptotic properties of $\hat\theta$ are well known, and the Princeton study (Andrews, et al. 1972) suggests that $\sqrt{n}(\hat\theta-\theta)$ approaches normality fairly quickly. Huber (1970), Gross (1976, 1977), and Shorack (1976) have constructed approximate confidence intervals for $\theta$ based on studentization of $\sqrt{n}(\hat\theta-\theta)$ by estimates of the asymptotic standard deviation.

In this paper a new method of constructing approximate confidence intervals for $\theta$ is proposed for the special class of monotone nondecreasing, right continuous $\psi$ functions. The method exploits the fact that

$$\lambda_{F_n}(c) = -\frac{1}{n}\sum_{i=1}^{n}\psi(X_i-c)$$

is like a distribution function (i.e., nondecreasing and right continuous). In particular, the endpoints of the proposed confidence interval are "percentiles" of $\lambda_{F_n}$.

Section 2 gives a motivating example and Section 3 provides the basic ideas and method. In Section 4 Monte Carlo results and comparisons with other results are mentioned. Section 5 shows how to extend to the regression situation and Section 6 is a short summary.
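To make the distribution-function analogy concrete, the following is a minimal Python sketch, not code from the paper. It evaluates $\lambda_{F_n}(c)$ for a Huber $\psi$ and finds $\hat\theta$ as the point where $\lambda_{F_n}$ crosses zero, using bisection (which is valid only because $\lambda_{F_n}$ is nondecreasing). The function names `huber_psi`, `lam`, and `m_estimate`, the choice $k = 1.5$, and the bisection tolerance are all illustrative assumptions; no scale estimate is used, as in (1.2).

```python
import numpy as np

def huber_psi(x, k=1.5):
    """Huber psi: max(-k, min(k, x)). The value k=1.5 is an illustrative choice."""
    return np.clip(x, -k, k)

def lam(c, x, psi=huber_psi):
    """lambda_{F_n}(c) = -(1/n) * sum_i psi(x_i - c); nondecreasing in c."""
    return -np.mean(psi(np.asarray(x, dtype=float) - c))

def m_estimate(x, psi=huber_psi, tol=1e-10):
    """Approximate inf{c : lambda_{F_n}(c) >= 0} by bisection.

    lambda_{F_n} is <= 0 at min(x) and >= 0 at max(x), so the root is bracketed.
    """
    lo, hi = float(np.min(x)), float(np.max(x))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam(mid, x, psi) < 0:
            lo = mid   # crossing lies to the right of mid
        else:
            hi = mid   # crossing lies at or to the left of mid
    return 0.5 * (lo + hi)
```

For the sample `[1, 2, 3, 4, 100]` the clipped residuals at $c = 3$ are $-1.5, -1, 0, 1, 1.5$, which sum to zero, so `m_estimate` returns a value very close to 3 despite the outlier.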

2. MOTIVATION FROM QUANTILE ESTIMATION

Let $F^{-1}(p) = \inf\{x : F(x) \ge p\}$. Consider estimation of the $p$th quantile $F^{-1}(p)$, $0<p<1$, from a sample having distribution $F$. If $F_n$ is the usual empirical df, then

$$P\big(F_n^{-1}(a) \le F^{-1}(p) < F_n^{-1}(b)\big) = P\big(a \le F_n(F^{-1}(p)) < b\big),$$

using the fact that all df's $G$ satisfy $G^{-1}(t) \le x$ iff $t \le G(x)$. For independent $X_i$ the statistic $nF_n(F^{-1}(p))$ is binomial $(n, F(F^{-1}(p)))$. Thus the normal approximation to the binomial and the assumption $F(F^{-1}(p)) \approx p$ lead us to choose

$$p - a = b - p = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

for an approximate $(1-\alpha)$ confidence interval for $F^{-1}(p)$. ($z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal.) Although exact nonparametric procedures exist for iid samples from continuous distributions, the above method generalizes to quantile estimation in more complicated situations, e.g., stratified sampling from finite populations. The important point for the present discussion is that M-estimation can use the same idea with $F_n$ replaced by $\lambda_{F_n}$.

3. APPROXIMATE CONFIDENCE INTERVALS FOR $\theta$

Let $\psi(t)$ be nondecreasing, right continuous, and strictly positive (negative) for large positive (negative) values of $t$. Two families of such $\psi$ are "Hubers," $\psi(x) = \max(-k, \min(k, x))$, and "$v$th power," $\psi(x) = |x|^{v}\,\mathrm{sgn}(x)$, $0 < v \le 1$. For df's $G$ define

$$\lambda_G(c) = -\int_{-\infty}^{\infty}\psi(x-c)\,dG(x), \qquad -\infty < c < \infty,$$

and

$$\lambda_G^{-1}(t) = \inf\{x : \lambda_G(x) \ge t\}, \qquad t \in \Big(\inf_{-\infty<x<\infty}\lambda_G(x),\ \sup_{-\infty<x<\infty}\lambda_G(x)\Big).$$

The parameter and estimate are defined by $\theta = \lambda_F^{-1}(0)$ and $\hat\theta = \lambda_{F_n}^{-1}(0)$. Similar to the case of df's it follows that $\lambda_G^{-1}(t) \le x$ iff $t \le \lambda_G(x)$, and thus

$$P\big(\lambda_{F_n}^{-1}(a) \le \theta < \lambda_{F_n}^{-1}(b)\big) = P\big(a \le \lambda_{F_n}(\theta) < b\big) = P\Big(\frac{\sqrt{n}\,a}{\hat\sigma_\psi} \le \frac{\sqrt{n}\,\lambda_{F_n}(\theta)}{\hat\sigma_\psi} < \frac{\sqrt{n}\,b}{\hat\sigma_\psi}\Big),$$

where $\hat\sigma_\psi^2 = (n-1)^{-1}\sum_{i=1}^{n}\psi^2(X_i-\hat\theta)$ is a reasonable estimate of $\int\psi^2(x-\theta)\,dF(x)$. The statistic of interest,

$$T = \frac{\sqrt{n}\,\lambda_{F_n}(\theta)}{\hat\sigma_\psi} = \frac{-\,n^{-1/2}\sum_{i=1}^{n}\psi(X_i-\theta)}{\Big[\frac{1}{n-1}\sum_{i=1}^{n}\psi^2(X_i-\hat\theta)\Big]^{1/2}},$$

has a form very close to a t statistic based on the rv's $\psi(X_i-\theta)$. If the $X_i$ are symmetric about $\theta$ and $\psi(x) = -\psi(-x)$, then we expect $T$ to be close to a t distribution with $n-1$ degrees of freedom. Choosing $-\sqrt{n}\,a/\hat\sigma_\psi = t_{\alpha/2} = \sqrt{n}\,b/\hat\sigma_\psi$, where $t_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of that t distribution, our proposed approximate $(1-\alpha)$ confidence interval is

$$\Big[\lambda_{F_n}^{-1}\big(-t_{\alpha/2}\,\hat\sigma_\psi/\sqrt{n}\big),\ \lambda_{F_n}^{-1}\big(t_{\alpha/2}\,\hat\sigma_\psi/\sqrt{n}\big)\Big). \tag{3.1}$$

Under suitable regularity conditions, the asymptotic width of (3.1) is comparable to that of methods based on studentizing $\sqrt{n}(\hat\theta-\theta)$.

It is often desirable that location estimates be location and scale equivariant, i.e., satisfy $\hat\theta(aX_1+b, \ldots, aX_n+b) = a\,\hat\theta(X_1, \ldots, X_n) + b$. For M-estimates the usual procedure is to replace $\psi(x)$ by $\psi(x/\hat\sigma)$, where $\hat\sigma$ is a suitable scale estimate, or to solve simultaneous equations as in Huber's Proposal 2. The above methods carry through exactly and the analogous statistic of interest is

$$T_{\hat\sigma} = \frac{-\,n^{-1/2}\sum_{i=1}^{n}\psi\big((X_i-\theta)/\hat\sigma\big)}{\Big[\frac{1}{n-1}\sum_{i=1}^{n}\psi^2\big((X_i-\hat\theta)/\hat\sigma\big)\Big]^{1/2}}. \tag{3.2}$$

4. COMPARISONS AND MONTE CARLO RESULTS

For small samples the form (3.2) is more appealing than $\sqrt{n}(\hat\theta-\theta)/S$, where $S$ is an estimate of the asymptotic standard deviation $[\mathrm{Var}\,IC(X_1)]^{1/2}$, for the following reason. Although $\hat\theta-\theta$ is approximated by $n^{-1}\sum IC(X_i)$, Boos (1977) shows that this approximation is at best $O_p(n^{-1})$, since $n\big[\hat\theta-\theta-n^{-1}\sum IC(X_i)\big]$ has a limit distribution. Thus, proximity of $\sqrt{n}(\hat\theta-\theta)/S$ to a t distribution depends on both the t-like statistic $n^{-1/2}\sum IC(X_i)/S$ and the approximation of $\hat\theta-\theta$ by $n^{-1}\sum IC(X_i)$. In fact Gross (1976) prefers to avoid use of the t distribution. On the other hand Shorack (1976) seems to get very good t approximations for certain Hampels.

In order to spot check the performance of the approximate confidence intervals based on (3.2), a small Monte Carlo study was performed. Table 1 gives the empirical error probabilities and $\sqrt{n}$ times the expected confidence interval lengths (ECIL) for 10,000 Monte Carlo "samples" generated by the McGill "Super-Duper" random number generator. A different set of 10,000 samples was used for each distribution (normal; logistic; D-Exp = double exponential; T3 = t distribution with 3 degrees of freedom; slash = a standard normal deviate divided by an independent uniform (0,1) deviate) and for each sample size, $n = 10$ and $n = 20$. Only crude Monte Carlo techniques were used, so considerable error may exist in the 3rd decimal of the empirical probabilities and in the 2nd decimal of the ECIL. This is exemplified by the mean, whose exact error probability we know to be .05 for the normal. SQRT is the M-estimator based on $\psi(x) = |x|^{1/2}\,\mathrm{sgn}(x)$, and Hk = 1.0, 1.5 are Hubers with $k = 1.0, 1.5$ using a normalized interquartile range as an estimate of scale. Hk* = 1.5 is Huber's Proposal 2 with $k = 1.5$.

Table 1. Empirical Error Probabilities and Expected 95-Percent Confidence Interval Lengths (multiplied by $\sqrt{n}$)

                         n = 10                                  n = 20
Estimator   Normal Logistic  D-Exp     T3   Slash   Normal Logistic  D-Exp     T3   Slash

a. Empirical Error Probabilities
Mean          .054     .048   .045   .039    .022     .055     .046   .049   .042    .020
SQRT          .039     .033   .030   .029    .017     .048     .043   .040   .040    .026
Hk = 1.0      .046     .040   .035   .036    .031     .055     .048   .045   .048    .043
Hk = 1.5      .059     .050   .046   .044    .033     .056     .050   .048   .049    .040
Hk* = 1.5     .060     .053   .048   .046    .036     .058     .053   .050   .051    .039

b. Expected 95-Percent Confidence Interval Lengths (multiplied by $\sqrt{n}$)
Mean          4.39     4.34   4.25   6.88  193.79     4.13     4.09   4.06   6.62  128.90
SQRT          4.93     4.69   4.34   6.73   84.35     4.45     4.17   3.73   5.76   32.15
Hk = 1.0      4.97     4.65   4.14   6.28   14.60     4.41     4.08   3.54   5.34   11.36
Hk = 1.5      4.51     4.30   3.94   5.95   14.61     4.21     3.99   3.62   5.37   12.22
Hk* = 1.5     4.45     4.22   3.88   5.84   14.81     4.16     3.93   3.60   5.32   12.62

For both $n = 10$ and $n = 20$ the true levels are generally conservative, but Hk = 1.5 and Hk* = 1.5 are fairly close to .05 except for the slash distribution, and each has reasonably short ECIL. It is mildly surprising that the mean is so close to .05 from normal to T3, though the ECIL are expectedly large for heavy tails. SQRT seems to perform worst overall.

Table 2 represents Monte Carlo estimates of the percentiles of $T_{\hat\sigma}$. The percentiles tend to be larger than those of a t distribution for the normal and generally smaller for the heavier-tailed distributions. For $n = 20$ and $\alpha = .05$ all estimates except for SQRT and the mean evaluated at the slash distribution are very close to $t_{.05} = 1.73$ (we should note that the method of calculating the estimated percentiles resulted in considerable error in the second decimal place).

5. REGRESSION

The Huber (1973) regression model is

$$X_i = \sum_{j=1}^{p} c_{ij}\theta_j + U_i, \qquad i = 1, \ldots, n,$$

where $E(U_i) = 0$ and the $c_{ij}$ are known coefficients. Write $\psi_{\hat\sigma}(x) = \psi(x/\hat\sigma)$ and let $(\hat\theta_1, \ldots, \hat\theta_p, \hat\sigma)$ be solutions of

$$\sum_{i=1}^{n}\psi_{\hat\sigma}\Big(X_i - \sum_{k=1}^{p} c_{ik}\hat\theta_k\Big)c_{ij} = 0, \qquad j = 1, \ldots, p,$$

$$\frac{1}{n-p}\sum_{i=1}^{n}\psi_{\hat\sigma}^2\Big(X_i - \sum_{k=1}^{p} c_{ik}\hat\theta_k\Big) = \beta.$$

Define

$$Q_{n,r}(t) = -\sum_{i=1}^{n}\psi_{\hat\sigma}\Big(X_i - \sum_{k \ne r} c_{ik}\hat\theta_k - c_{ir}t\Big)c_{ir}, \qquad r = 1, \ldots, p.$$

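Then $\hat\theta_r = Q_{n,r}^{-1}(0)$ and

$$P\big(Q_{n,r}^{-1}(a) \le \theta_r < Q_{n,r}^{-1}(b)\big) = P\big(a \le Q_{n,r}(\theta_r) < b\big).$$

By Taylor expansion in $\hat\theta_r - \theta_r$ we find

$$Q_{n,r}(\theta_r) = -\sum_{i=1}^{n}\psi_{\hat\sigma}\Big(X_i - \sum_{k=1}^{p} c_{ik}\hat\theta_k + c_{ir}(\hat\theta_r-\theta_r)\Big)c_{ir}$$

$$= -\sum_{i=1}^{n}\psi_{\hat\sigma}\Big(X_i - \sum_{k=1}^{p} c_{ik}\hat\theta_k\Big)c_{ir} \;-\; \sum_{i=1}^{n}\psi_{\hat\sigma}'\Big(X_i - \sum_{k=1}^{p} c_{ik}\hat\theta_k + c_{ir}\theta_r^{*}\Big)c_{ir}^{2}\,(\hat\theta_r-\theta_r),$$

where $\theta_r^{*}$ lies between 0 and $\hat\theta_r-\theta_r$. The first term in the above expression is 0, and $\theta_r^{*} \to_p 0$ if $\hat\theta_r \to_p \theta_r$. Thus, under suitable regularity conditions, $n^{-1/2}Q_{n,r}(\theta_r)$ is asymptotically normal with mean 0 and some variance $\sigma_r^2$. An approximate confidence interval for $\theta_r$ is

$$\Big[Q_{n,r}^{-1}\big(-t_{\alpha/2}\,\hat\sigma_r\sqrt{n}\big),\ Q_{n,r}^{-1}\big(t_{\alpha/2}\,\hat\sigma_r\sqrt{n}\big)\Big],$$

where $\hat\sigma_r^2$ is an estimate of $\sigma_r^2$.

The advantage of this method over the usual methods is not clear. (The simplicity of the location model is gone!) Note, though, that use of $\psi'$ is not required and that for least absolute value regression the above method circumvents an estimate of $f(F^{-1}(\tfrac{1}{2}))$.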

6. SUMMARY AND CONCLUSIONS

A new procedure for constructing confidence intervals for a location parameter has been proposed which exploits the monotonicity of a class of $\psi$ functions. The distributional problem is reduced to consideration of a t-like statistic, and Monte Carlo results verify that "Hubers" perform fairly well over a range of distributions and for samples of size $n = 10$ and $n = 20$.
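The procedure summarized above can be sketched in Python as follows. This is an illustrative reading of the method, not the code used for the Monte Carlo study. It uses a Huber $\psi$ with $k = 1.5$, standardizes residuals by an interquartile range divided by 1.349 (the divisor is an assumption about what "normalized" means), inverts $\lambda_{F_n}$ by bisection, and takes the endpoints at $\mp t_{\alpha/2}$ times the estimated standard deviation of $\psi$ over $\sqrt{n}$. The names `lam_inv`, `m_interval`, and `coverage_error` are hypothetical, and the sketch assumes the sample interquartile range is nonzero.

```python
import numpy as np
from scipy import stats

def huber_psi(u, k=1.5):
    """Huber psi; k=1.5 is an illustrative choice."""
    return np.clip(u, -k, k)

def lam_inv(t, x, s, psi=huber_psi):
    """Approximate inf{c : lambda(c) >= t}, lambda(c) = -mean(psi((x - c)/s))."""
    lam = lambda c: -np.mean(psi((x - c) / s))
    lo, hi = float(x.min()), float(x.max())
    step = s
    for _ in range(60):            # move lo left until lambda(lo) < t
        if lam(lo) < t:
            break
        lo -= step
        step *= 2
    else:
        return -np.inf             # t is at or below the infimum of lambda
    step = s
    for _ in range(60):            # move hi right until lambda(hi) >= t
        if lam(hi) >= t:
            break
        hi += step
        step *= 2
    else:
        return np.inf              # t is above the supremum of lambda
    for _ in range(60):            # bisection on the nondecreasing lambda
        mid = 0.5 * (lo + hi)
        if lam(mid) < t:
            lo = mid
        else:
            hi = mid
    return hi

def m_interval(x, alpha=0.05, psi=huber_psi):
    """Return (lower, theta_hat, upper) for the percentile-type interval."""
    x = np.asarray(x, dtype=float)
    n = x.size
    q25, q75 = np.percentile(x, [25, 75])
    s = (q75 - q25) / 1.349        # assumed normalization of the IQR
    theta_hat = lam_inv(0.0, x, s, psi)
    sd_psi = np.sqrt(np.sum(psi((x - theta_hat) / s) ** 2) / (n - 1))
    half = stats.t.ppf(1 - alpha / 2, n - 1) * sd_psi / np.sqrt(n)
    return lam_inv(-half, x, s, psi), theta_hat, lam_inv(half, x, s, psi)

def coverage_error(n=20, reps=2000, alpha=0.05, seed=0):
    """Crude Monte Carlo error rate for standard normal samples (theta = 0)."""
    rng = np.random.default_rng(seed)
    misses = 0
    for _rep in range(reps):
        lo, _theta_hat, hi = m_interval(rng.standard_normal(n), alpha)
        misses += not (lo <= 0.0 < hi)
    return misses / reps
```

Calling `coverage_error(n=20)` gives a rough empirical error probability that can be compared in spirit with the Hk = 1.5 normal entry of Table 1, although the random number generator, the number of replications, and possibly the scale normalization differ from the original study.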

REFERENCES

Andrews, D. F., et al. (1972), Robust Estimates of Location: Survey and Advances, Princeton, N.J.: Princeton University Press.

Boos, Dennis D. (1977), "Limiting Second Order Distributions for First Order Functionals, with Application to L- and M-Statistics," Institute of Statistics Mimeo Series #1152, North Carolina State University, Raleigh, N.C.

Gross, Alan M. (1976), "Confidence Interval Robustness with Long-Tailed Symmetric Distributions," Journal of the American Statistical Association, 71, 409-416.

Gross, Alan M. (1977), "Confidence Intervals for Bisquare Regression Estimates," Journal of the American Statistical Association, 72, 341-354.

Huber, Peter J. (1970), "Studentizing Robust Estimates," in Nonparametric Techniques in Statistical Inference, ed. Madan L. Puri, Cambridge: Cambridge University Press, 453-463.

Huber, Peter J. (1973), "Robust Regression: Asymptotics, Conjectures, and Monte Carlo," Annals of Statistics, 1, 799-821.

Shorack, Galen R. (1976), "Robust Studentization of Location Estimates," Statistica Neerlandica, 30, 119-142.