Median as a Weighted Arithmetic Mean of All Sample Observations

SK Mishra
Dept. of Economics, NEHU, Shillong (India)

1. Introduction: Innumerably many textbooks in Statistics explicitly mention that one of the weaknesses (or properties) of the median (a well-known measure of central tendency) is that it is not computed by incorporating all sample observations. That is so because if the sample is $x = (x_1, x_2, \ldots, x_n)$, where the variate values are ordered such that $x_1 \le x_2 \le \ldots \le x_n$, then $\mathrm{median}(x) = (x_k + x_{n+1-k})/2$; $k = \mathrm{int}((n+1)/2)$. Here $\mathrm{int}(.)$ is the integer value of $(.)$; for example, $\mathrm{int}(1.5) = 1$. This formula, although queer and expressed in a somewhat roundabout way, applies uniformly whether $n$ is odd or even. Evidently, $\mathrm{median}(x)$ is not obtained by incorporating all the values of $x$, and hence the alleged weakness of the median as a measure of central tendency.

2. The Median Minimizes the Absolute Norm of Deviations: It is commonplace knowledge in Statistics that the statistic $\bar{x}$ (the arithmetic mean of $x$) minimizes the (squared) Euclidean norm of deviations of the variate values from itself. Explicitly stated, it minimizes $S = \sum_{i=1}^{n} (x_i - c)^2$, since $S$ attains its minimum when $c = \bar{x}$. To obtain this result, one may also minimize $S^{1/2}$ (the Euclidean norm per se). On the other hand, the median minimizes the absolute norm of deviations of the variate from itself, expressed as $M = \sum_{i=1}^{n} |x_i - c|$, which yields $c = \mathrm{median}(x)$. In a general framework, we obtain the arithmetic mean or the median by minimizing the general Minkowski norm $\left( \sum_{i=1}^{n} |x_i - c|^p \right)^{1/p}$ for $p = 2$ or $p = 1$ respectively. This view of the arithmetic mean and the median gives them the meaning of being measures of central tendency.

3. Indeterminacy of Median when the Number of Values in the Sample is Even: When, in the sample $x = (x_1, x_2, \ldots, x_n)$, the number of observations, $n$, is odd, the value of $\mathrm{median}(x) = (x_k + x_{n+1-k})/2$; $k = \mathrm{int}((n+1)/2)$ is determinate; $x_k = x_{n+1-k}$ minimizes the absolute norm, $M$. However, when $n$ is an even number, $x_k$ and $x_{n+1-k}$ are (very often) different. As a matter of fact, any number $z$ for which the relationship $x_k \le z \le x_{n+1-k}$ holds minimizes the absolute norm of deviations. Thus, the median is indeterminate. It has been customary, therefore, that in the absence of any other relevant information, one uses the principle of insufficient reason and obtains $\mathrm{median}(x) = (x_k + x_{n+1-k})/2$. However, it remains a truth that any number $z$ for which the relationship $x_k \le z \le x_{n+1-k}$ holds is the value of the median as much as $z = (x_k + x_{n+1-k})/2$.
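The median formula of section 1 and the minimization properties of sections 2 and 3 can be checked numerically. The following is a minimal Python sketch; the helper names (`conventional_median`, `squared_norm`, `absolute_norm`) and the four-point sample are illustrative assumptions and do not come from the paper.

```python
def conventional_median(x):
    """Median as (x_k + x_{n+1-k}) / 2 with k = int((n + 1) / 2), ranks 1-based."""
    xs = sorted(x)
    n = len(xs)
    k = (n + 1) // 2
    # 1-based ranks k and n+1-k correspond to 0-based indices k-1 and n-k.
    return (xs[k - 1] + xs[n - k]) / 2


def squared_norm(x, c):
    """S: sum of squared deviations of x from c."""
    return sum((xi - c) ** 2 for xi in x)


def absolute_norm(x, c):
    """M: sum of absolute deviations of x from c."""
    return sum(abs(xi - c) for xi in x)


# Hypothetical even-sized sample; sorted it is 1, 4, 7, 10,
# so x_k = 4 and x_{n+1-k} = 7.
x_even = [7.0, 1.0, 4.0, 10.0]
x_bar = sum(x_even) / len(x_even)

# S at the arithmetic mean, and at two nearby values of c for comparison.
print(squared_norm(x_even, x_bar),
      squared_norm(x_even, x_bar - 0.5),
      squared_norm(x_even, x_bar + 0.5))

# M evaluated at several z inside [x_k, x_{n+1-k}], then at points outside it.
print([absolute_norm(x_even, z) for z in (4.0, 5.0, 5.5, 6.5, 7.0)])
print(absolute_norm(x_even, 3.0), absolute_norm(x_even, 8.0))

# The conventional formula for an even and an odd sample size.
print(conventional_median(x_even), conventional_median(x_even + [5.0]))
```

For this sample, $M$ is constant over the whole interval between the two middle values, which is the indeterminacy described in section 3.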

4. Median as a Weighted Arithmetic Mean of Sample Observations: If $x = (x_1, x_2, \ldots, x_n)$ are ordered such that $x_1 \le x_2 \le \ldots \le x_n$, it is possible to express the median as a weighted arithmetic mean $\sum_{j=1}^{n} x_j w_j / \sum_{j=1}^{n} w_j$, where $w_j = w_{n+1-j} = 0.5$ for $j = \mathrm{int}((n+1)/2)$ and $w_j = 0$ for all other $j$. However, this is trivial. Now we present a non-trivial alternative algorithm to obtain $\mathrm{median}(x)$. In order to use this algorithm it is not necessary that the values of $x$ be arranged in ascending (or descending) order; that is, the condition $x_1 \le x_2 \le \ldots \le x_n$ is relaxed. The steps in the algorithm are as follows:

(i) Set $w_i = 1$; $i = 1, 2, \ldots, n$. Obviously, $\sum_{i=1}^{n} w_i = n$.
(ii) Find $v_1 = \sum_{i=1}^{n} x_i w_i / \sum_{i=1}^{n} w_i$, the weighted arithmetic mean of $(x_1, x_2, \ldots, x_n)$.
(iii) Find new weights $w_i = 1/d_i$ if $d_i = |x_i - v_1| \ge \varepsilon$ (where $\varepsilon > 0$ is a small number); else set $w_i$ to a small number; $i = 1, 2, \ldots, n$.
(iv) Find $v_2 = \sum_{i=1}^{n} x_i w_i / \sum_{i=1}^{n} w_i$ using the weights obtained in (iii) above.
(v) If $|v_1 - v_2| \ge \tau$ (where $\tau$ is a very small number controlling the accuracy of the result), then $v_1$ is replaced by $v_2$ (that is, $v_2$ is renamed as $v_1$) and go to step (iii); else
(vi) The median is $v_2$, and $w = (w_1, w_2, \ldots, w_n)$ are the weights associated with $(x_1, x_2, \ldots, x_n)$. Stop.

This algorithm yields non-trivial weights $w = (w_1, w_2, \ldots, w_n)$. It yields a median identical to that obtained by the conventional formula if $n$ is odd. If $n$ is even, it gives a number $z$ with $x_k \le z \le x_{n+1-k}$, which is a median in the sense discussed above.

5. Some Monte Carlo Experiments: We have conducted some Monte Carlo experiments to study the performance of the alternative method (weighted arithmetic mean representation) vis-à-vis the conventional method of obtaining the median. Three sample sizes have been considered. Samples have been drawn from five distributions (Normal, Beta1, Beta2, Gamma and Uniform). In each case, repeated experiments have been carried out. The alternative estimator counts as a success if it obtains a median identical to that obtained by the conventional method when $n$ is odd, and a median $z$ with $x_k \le z \le x_{n+1-k}$ when $n$ is even. The summary of results is presented in table 1.
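Steps (i) to (vi) and the success criterion of the experiments can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function names, the values of `eps`, `tau` and `small_weight`, the iteration cap `max_iter`, the Gamma(2, 1) sampling distribution and the number of trials are all assumptions. The sketch also assumes that the sample values are distinct.

```python
import random


def conventional_median(x):
    """(x_k + x_{n+1-k}) / 2 with k = int((n + 1) / 2), ranks 1-based."""
    xs = sorted(x)
    n = len(xs)
    k = (n + 1) // 2
    return (xs[k - 1] + xs[n - k]) / 2


def weighted_mean_median(x, eps=1e-6, tau=1e-9, small_weight=1e-6, max_iter=10_000):
    """Iterated weighted mean with weights 1/|x_i - v|; returns (v, weights).

    eps, tau, small_weight and max_iter are assumed values.
    """
    w = [1.0] * len(x)                                              # step (i)
    v1 = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)              # step (ii)
    for _ in range(max_iter):
        d = [abs(xi - v1) for xi in x]
        w = [1.0 / di if di >= eps else small_weight for di in d]   # step (iii)
        v2 = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)          # step (iv)
        if abs(v1 - v2) < tau:                                      # steps (v) and (vi)
            return v2, w
        v1 = v2
    return v1, w  # safety stop, not one of the described steps


def is_success(x, tol=1e-5):
    """Odd n: both medians agree. Even n: the result lies in [x_k, x_{n+1-k}]."""
    xs = sorted(x)
    n = len(xs)
    k = (n + 1) // 2
    v, _ = weighted_mean_median(x)
    if n % 2 == 1:
        return abs(v - conventional_median(x)) <= tol
    return xs[k - 1] - tol <= v <= xs[n - k] + tol


def success_rate(n, trials=200, seed=0):
    """Share of successful trials for Gamma(2, 1) samples (an assumed distribution)."""
    rng = random.Random(seed)
    hits = sum(is_success([rng.gammavariate(2.0, 1.0) for _ in range(n)])
               for _ in range(trials))
    return hits / trials


# Odd sample: the result should be close to the conventional median, 3.
print(weighted_mean_median([1.0, 2.0, 3.0, 4.0, 100.0])[0])
# Even sample: the result should lie between the two middle values, 3 and 4.
print(weighted_mean_median([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])[0])
print(success_rate(n=11), success_rate(n=10))
```

The small fallback weight in step (iii) matters: an observation that coincides with the current value of $v_1$ should not dominate the next weighted mean, otherwise the iteration could stall at a data point that is not the median.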

Table 1. Performance of the Alternative Method to obtain Median
Columns: Distribution | Sample Size (n) | Arithmetic Mean | Median (Conventional) | Median (Alternative) | Inclination to Mean | Success Rate (%)
Rows, in order: Uniform, Gamma, Beta1, Beta2, Normal
5.49 5.47 5.7 49.98979 5.35 5.35 5 49.995 5.57 5.789.563.357.57886 Yes.5647.34.345 5.5656.745.5 Yes 5.6643 5.4544 5.4756 5.6337 5.655 5.655 5 5.64 53.496 5.887 3343.83487 64.9 74.894 Yes 3346.48 56.964 56.964 5 3346.43339 5.937 59.6473 Yes.6 -.49.596.4474.3999.3999 5.77 -.95 -.4733

We find that when $n$ is odd, irrespective of the distribution or the sample size, both methods yield identical results. When the distribution is skewed (i.e. there is a significant divergence between median and mean) and $n$ is even, the alternative median is slightly pulled by the mean (its inclination is towards the mean). This appears justified because it is expected that the values lying between $x_k$ and $x_{n+1-k}$ (for $x = (x_1, x_2, \ldots, x_n)$ with $x_1 \le x_2 \le \ldots \le x_n$; $k = \mathrm{int}((n+1)/2)$) must be more densely distributed on the side of the mean. The conventional method, however, considers them uniformly distributed for want of information. The alternative method appears to exploit the information contained in the sample.

6. Analysis of Inclination of Computed Medians to Mean Value: We have seen that when $n$ is an even number, the values of the median estimated by the two methods differ, and the one estimated by the alternative (weighted arithmetic mean) method appears to be pulled towards the mean value, $\bar{x}$. Then a question arises: is the median estimated by the alternative method biased (towards the mean)? To investigate this question, we generate some $a$ values of $v$ such that $x_k \le v \le x_{n+1-k}$, and $v$ follows the distribution identical to that of $x$. We do it again and again a large number of times. We count how many times $v$ is less than the median values obtained by the two competing methods. The probability that $v$ equals a computed median is very small (in our experiment we never encountered equality). Table 2 clearly shows that in the case of the Gamma and Beta2 distributions both medians are pulled by the mean, though the median obtained by the alternative method is more inclined to the mean. The pull is stronger in the case of the Gamma distribution, since it is more skewed than the Beta2 distribution. In the case of the normal distribution we find the opposite tendency (push). In the case of the uniform distribution no pull or push is observed, while in the case of the Beta1 distribution the observation is mixed.

Table 2. Inclination of the Competing Methods to the Mean Value
Columns: Distribution | Sample Size (n) | a (no. of v values generated) | Median (Conventional) | Inclination to Mean | Median (Alternative) | Inclination to Mean
Uniform 8.4993 No.49975 No 5 8.4993 No.588 No
Gamma 8.9367 + Yes.97439 + + Yes 5 8.9367 + Yes.98637 + + Yes
Beta1 8.54 No.578 No 5 8.537 No.46743 - Yes
Beta2 8.9886 + Yes.987 + Yes 5 8.9888 + Yes.98743 + Yes
Normal 8.884 - - Yes.856 - Yes 5 8.88433 - - Yes.788 - Yes
Legend: + pull; + + stronger pull; - push; - - stronger push

7. Relative Efficiency and Consistency of the Competing Methods: Now suppose we generate a large number of variate values following a specified distribution. Let us call the collection of these values $U$, or the Universe. We may obtain $\mathrm{Median}(U) = \mu$, say. This value may not be the true median of the distribution (which it would approach if $U$ were very large), but it is likely to be very close to it.

Table 3. Efficiency of the Competing Methods to obtain Median
Columns: Distribution | Sample Size (n) | True Median (of U) | Computed Median ($m_1$) | Norm1 (ref. $m_1$) | Computed Median ($m_2$) | Norm2 (ref. $m_2$)
Rows, in order: Uniform, Gamma, Beta1, Beta2, Normal
49.5468 5.4744 8.733 5.583 948.4647 5 49.5468 49.778 9.58568 49.8445 99.679 9 49.5468 49.59458 47.33987 49.656 43.96779.88.379 64.9.65 75.3577 5.88.66 6.4453.6538 6.9757 9.88.75.6366.38.764 5.4387 46.36445 89.6945 46.3985 6644.6 5 5.4387 53.8799 885.476 53.435 786.8595 9 5.4387 48.385 373.5866 48.698 344.8679 55.647 89.5683 53555.9667 58.66 776.756 5 55.647 574.58766 46.38 68.486 439.7566 9 55.647 53.3939 578.37985 539.864 654.75995 -.7553 -.68537 99.56588 -.8465 667.7753 5 -.7553 -.9968 8.4875 -.844 68.638 9 -.7553 -.883.75 -.8 7.4995

From $U$ we may draw some $n$ random values (three sample sizes in our case), say $x = (x_1, x_2, \ldots, x_n)$, and compute medians ($m_1$ and $m_2$) by the two competing methods (respectively) again and again over a number of trials, with replacement. In each draw, the computed medians will differ from $\mu$. From this, we may obtain the norms for each median. These norms would suggest which median is most frequently closer to $\mu$. Symbolically,

$\mathrm{Norm}_t = \left( \sum_{i=1}^{\mathrm{trial}} |m_{t,i} - \mu|^p \right)^{1/p}$; $t = 1$ for the traditional method, $t = 2$ for the alternative.

We have used the absolute norm ($p = 1$ in the formula defining the norm). The results of the experiments are given in table 3. We observe that for the Uniform, Normal and Beta1 distributions norm2 is smaller than norm1. For the Gamma and Beta2 distributions the opposite is true. We also observe that the norms are smaller for larger values of $n$, indicating consistency.

8. Asymmetry of Distribution and Efficiency of the Competing Methods: It is well known that the Gamma distribution is severely skewed for small shape parameters, but with increasing values of that parameter the distribution tends to become symmetric.

Table 4. Asymmetry of Distribution and Efficiency of the Competing Methods
Columns: Distribution Gamma (shape parameter) | Sample Size (n) | True Median (of U) | Computed Median ($m_1$) | Norm1 (ref. $m_1$) | Computed Median ($m_2$) | Norm2 (ref. $m_2$)
Gamma(.5).88.379 64.9.65 75.3577 5.88.66 6.4453.6538 6.9757 9.88.75.6366.38.764
Gamma(.) 3.4779 3.6356 5.399 3.97688 5.85 5 3.4779 3.54.7396 3.6347.36 9 3.4779 3.43676 4.85865 3.4843 5.49
Gamma(.) 8.33893 8.6384 86.3677 9.53 9.568 5 8.33893 8.47 8.77 8.5397 8.9836 9 8.33893 8.3935 8.833 8.4876 8.394
Gamma(4.) 8.4966 8.7355 87.69966 9.497 84.44678 5 8.4966 8.493 6.9773 8.5976 6.3634 9 8.4966 8.3383.55354 8.45.67358
Gamma(8.) 37.6569 38.4688 43.66793 38.9338 4.379 5 37.6569 37.5486 37.978 37.867 36.79947 9 37.6569 37.44675 5.6999 37.6967 5.979
Gamma(6.) 8.39965 8.758 5.958 8.59877 45.839 5 8.39965 8.394 4.79 8.5754 39.5575 9 8.39965 8.349 7.338 8.336 7.6
Gamma(5.) 49.3 48.446 86.735 48.994 836.985 5 49.3 48.77 8.8687 48.9874 8.44 9 49.3 48.89959 38.9887 49.58 38.433
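The norm reported in tables 3 and 4 can be illustrated with a small Python sketch. Everything specific here is an assumption made for illustration: the function names, the Gamma(2, 1) universe, its size, the number of trials, and the tolerances inside `weighted_mean_median`. Each sample is drawn without ties, and the universe is left intact between trials.

```python
import random


def conventional_median(x):
    """(x_k + x_{n+1-k}) / 2 with k = int((n + 1) / 2), ranks 1-based."""
    xs = sorted(x)
    n = len(xs)
    k = (n + 1) // 2
    return (xs[k - 1] + xs[n - k]) / 2


def weighted_mean_median(x, eps=1e-6, tau=1e-9, small_weight=1e-6, max_iter=1000):
    """Iterated weighted mean with weights 1/|x_i - v| (assumed tolerances)."""
    v1 = sum(x) / len(x)
    for _ in range(max_iter):
        w = [1.0 / abs(xi - v1) if abs(xi - v1) >= eps else small_weight for xi in x]
        v2 = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)
        if abs(v1 - v2) < tau:
            return v2
        v1 = v2
    return v1  # safety stop


def deviation_norm(estimates, mu, p=1):
    """Norm_t = (sum over trials of |m_{t,i} - mu|^p)^(1/p)."""
    return sum(abs(m - mu) ** p for m in estimates) ** (1.0 / p)


def compare_methods(n, universe_size=5000, trials=200, seed=1):
    """Return (norm1, norm2) for the conventional and the alternative median."""
    rng = random.Random(seed)
    universe = [rng.gammavariate(2.0, 1.0) for _ in range(universe_size)]
    mu = conventional_median(universe)      # Median(U)
    m1, m2 = [], []
    for _ in range(trials):
        x = rng.sample(universe, n)         # n distinct members of U per trial
        m1.append(conventional_median(x))
        m2.append(weighted_mean_median(x))
    return deviation_norm(m1, mu), deviation_norm(m2, mu)


for n in (10, 50):
    print(n, compare_methods(n))
```

A smaller norm for one method means that its computed medians stay closer to $\mu$ over the trials; no particular ordering of the two norms is implied by this sketch.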

Table 4 shows the relative norms for the competing methods for increasing values of the shape parameter of the Gamma variate. We observe that norm2 becomes uniformly smaller (than norm1) once the shape parameter is sufficiently large. This experiment reinforces our conclusion that the alternative method of obtaining the median is better than the conventional method when the distribution is less asymmetric.

9. Conclusion: This study establishes that the median may be expressed as a weighted arithmetic mean of all sample observations. If the conventional formula does not incorporate all sample values, that is a property of the specific method of computation and not of the median per se, as is often alleged. If our experiments convey something, then we may also state that for relatively more symmetric distributions the alternative formula (weighted mean) performs better than the conventional method. But for heavily asymmetric distributions the conventional method of computing the median performs better, although both methods yield biased estimates. The alternative algorithm of computation is easily extended to other median-type estimators, such as the Least Absolute Deviation (LAD) estimator of the regression model $y = X\beta + u$, as shown by Fair (1974) and Schlossmacher (1973).

References

Fair, RC (1974). On the Robust Estimation of Econometric Models, Annals of Economic and Social Measurement, 3 (667-677).

Schlossmacher, EJ (1973). An Iterative Technique for Absolute Deviations Curve Fitting, Journal of the American Statistical Association, 68 (857-859).