Measures of Dispersion

Chapter 8 Measures of Dispersion

Definition of Measures of Dispersion (page 31)

A measure of dispersion is a descriptive summary measure that helps us characterize the data set in terms of how varied the observations are from each other.

A small value indicates that the observations are not too different from each other; that is, there is a concentration of observations about the center of the distribution.

On the other hand, a large value indicates that the observations are very different from each other or that they are widely spread out from the center.

The smallest possible value of a measure of dispersion should be 0. A zero measure should indicate the absence of variation.

Illustration

A = {98, 98, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 102, 102}
B = {20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180}

The means of both collections are equal to 100. Their medians are also equal to 100. Which collection must have a higher measure of dispersion? For which collection is the mean a more reliable measure of central tendency? (Reliable in the sense that if we repeatedly select an observation at random from the collection, its value is usually not too different from the mean.)

The measure of dispersion serves as a measure of the reliability of the mean or median as measures of central tendency.

General Classifications of Measures of Dispersion (page 32)

Measures of Absolute Dispersion: A measure of absolute dispersion has the same unit as the observations. (Examples: range, interquartile range, standard deviation)

Measures of Relative Dispersion: A measure of relative dispersion has no unit and is therefore useful in comparing the variability of one distribution with another distribution. (Example: coefficient of variation)
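Returning to collections A and B in the illustration, the short sketch below checks that the two share the same mean and median while differing greatly in spread. It assumes Python's standard `statistics` module and treats each collection as a complete population (hence `pstdev`); the variable names are only illustrative.

```python
from statistics import mean, median, pstdev

# Collections A and B from the illustration (17 observations each).
A = [98, 98, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100,
     101, 101, 101, 102, 102]
B = list(range(20, 181, 10))  # 20, 30, ..., 180

for name, data in (("A", A), ("B", B)):
    spread = max(data) - min(data)   # range = maximum - minimum
    sd = pstdev(data)                # population standard deviation
    print(f"{name}: mean={mean(data)}, median={median(data)}, "
          f"range={spread}, sd={sd:.2f}")
```

Both lines report a mean and median of 100, but the range and standard deviation of B are far larger, so the mean is a more reliable summary of A than of B.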

Definition of Range (page 32)

Definition 8.1. The range is the distance between the maximum value and the minimum value.

Range = Maximum - Minimum

Sometimes the range is presented by stating the smallest and the largest values.

Example 8.1 (page 33)

Given the weights of 5 rabbits (in pounds), find the range.

8 pounds (lightest), 10 pounds, 12 pounds, 14 pounds, 15 pounds (heaviest)

Solution: The maximum is 15 pounds and the minimum is 8 pounds. Thus, the range of the weights of the rabbits is

Range = maximum - minimum = 15 - 8 = 7 pounds

We can also say that the weights of the rabbits range from 8 to 15 pounds.

Approximating the Range from the FDT (pages 34-35)

$$\text{Range} = UCL_{HCI} - LCL_{LCI}$$

where
$UCL_{HCI}$ = upper class limit of the last (highest) class interval
$LCL_{LCI}$ = lower class limit of the first (lowest) class interval

Example 8.5:

Age (in years)    No. of women
5-9               7      (lowest class interval)
10-14             10
15-19             13
20-29             8
30-34             5
35-39             3      (highest class interval)
Total             64

Range = 39 - 5 = 34 years

Characteristics of the Range (page 35)

- It is a simple, easy-to-compute and easy-to-understand measure.

Weaknesses:

- It fails to communicate any information about the clustering or the lack of clustering of the values in the middle of the distribution, since it uses only the extreme values (minimum and maximum).
- An outlier can greatly affect its value.
- It tends to be smaller for smaller collections than for larger collections.
- It cannot be approximated from frequency distributions with an open-ended class.
- It is not tractable mathematically.
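The two ways of obtaining the range described above (from raw data, and approximately from the class limits of an FDT) can be sketched as follows. The function names `data_range` and `fdt_range` are hypothetical helpers introduced only for this illustration.

```python
def data_range(values):
    """Range of raw data: maximum minus minimum."""
    return max(values) - min(values)


def fdt_range(lowest_class, highest_class):
    """Approximate range from an FDT.

    Each argument is a (lower limit, upper limit) pair; the result is the
    upper limit of the highest class minus the lower limit of the lowest class.
    """
    lower_limit_of_lowest, _ = lowest_class
    _, upper_limit_of_highest = highest_class
    return upper_limit_of_highest - lower_limit_of_lowest


rabbit_weights = [8, 10, 12, 14, 15]          # pounds (rabbit example)
print(data_range(rabbit_weights))             # 7
print(fdt_range((5, 9), (35, 39)))            # 34 (age FDT example)
```

Only the extreme classes enter the approximation, which is why an open-ended first or last class makes the range impossible to approximate.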

Definition of the Interquartile Range (IQR)

The interquartile range (IQR) is the difference between the third and first quartiles of the data set. That is,

$$IQR = Q_3 - Q_1$$

Remarks about the IQR:

- The interquartile range reflects the variability of the middle 50% of the observations in the array.
- The IQR may be viewed as the range for a trimmed data set where the smallest 25% and the largest 25% of observations have been removed. This modified range addresses the weakness of the range's sensitivity towards outliers.
- A shortcoming of the IQR is that it could be 0 even if there is still some variation among the smallest 25% and largest 25% of all observations.
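The last remark can be made concrete with a small sketch. The data set below is hypothetical, chosen so that the middle of the array is constant while the extremes differ. Quartiles are computed with `statistics.quantiles` using its default "exclusive" method; this is an assumption, since textbooks and software packages use several slightly different quartile conventions.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range Q3 - Q1 (default quartile method of the stdlib)."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


# Hypothetical data: the middle observations are identical, the tails are not.
data = [1, 5, 5, 5, 5, 5, 5, 5, 9]

print(iqr(data))                # 0.0: no variation in the middle 50%
print(max(data) - min(data))    # 8: the range still detects the tails
```

Here the IQR is 0 although the smallest and largest observations clearly differ from the rest, which is exactly the shortcoming noted above.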

Example

An economist studying the variation in family incomes in a community found that the first quartile income is P36,500 and the third quartile income is P120,000. Find the interquartile range.

$$IQR = Q_3 - Q_1 = 120{,}000 - 36{,}500 = \text{P}83{,}500$$

Definition of the Variance (page 37)

Definition 8.2. The population variance is the mean of the squared deviations between each observed value and the mean.

Population Variance:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2$$

where
$X_i$ is the measure taken from the ith element of the population
$\mu$ is the population mean
$N$ is the population size

Sample Variance:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

where
$X_i$ is the measure taken from the ith element of the sample
$\bar{X}$ is the sample mean
$n$ is the sample size

Remarks: (page 37)

- The population variance is a parameter while the sample variance is a statistic.
- The squared difference of an observation from the mean gives us an idea of how close this observation is to the mean. A large squared difference indicates that the observation and the mean are far from each other, while a small squared difference indicates that the observation and the mean are close to each other.
- A small variance indicates that the observations are highly concentrated about the mean, so that it is appropriate to use the mean to represent all of the values in the collection.
- The sample variance is not the mean of the squared deviations of the observations from the mean. The denominator of the sample variance is not $n$ (the size of the sample); rather, it is $(n-1)$. The reason for using $(n-1)$ as the divisor is that, in Inferential Statistics, the corresponding statistic with $n$ as the divisor tends to underestimate the population variance. The divisor $(n-1)$ is used to make up for this tendency to underestimate.
- The unit of the variance is the square of the units of measure in the data set. Thus, strictly speaking, the variance is not a measure of absolute dispersion. Often, it is desirable to return to the original units of measure, and so it is the standard deviation that is presented.

Definition of Standard Deviation (page 38)

The standard deviation is the positive square root of the variance.

The population standard deviation for a finite population with $N$ elements, denoted by the Greek letter $\sigma$ (lower case sigma), is:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2}$$

The sample standard deviation for a sample with $n$ elements, denoted by the letter $s$, is:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}$$
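The remark about the divisor $(n-1)$ can be checked exhaustively on a tiny case. The sketch below is only an illustration under stated assumptions: it treats five values as a complete (hypothetical) population, enumerates every ordered sample of size 2 drawn with replacement, and averages the two candidate statistics over all samples. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population of five values.
population = [8, 10, 12, 14, 15]
N = len(population)
mu = Fraction(sum(population), N)
pop_var = sum((x - mu) ** 2 for x in population) / N


def sum_sq_dev_over(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))  # all 25 samples, with replacement

avg_with_n = sum(sum_sq_dev_over(s, n) for s in samples) / len(samples)
avg_with_n_minus_1 = sum(sum_sq_dev_over(s, n - 1) for s in samples) / len(samples)

print("population variance:     ", float(pop_var))
print("average using divisor n: ", float(avg_with_n))
print("average using divisor n-1:", float(avg_with_n_minus_1))
```

Under these assumptions, the statistic with divisor $n$ averages to less than the population variance, while the one with divisor $(n-1)$ averages to exactly the population variance.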

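Example 8.7 (pages 39-40)

Given the IQ of 7 students in the sample, compute the sample standard deviation.

Let $X_i$ = IQ of the ith student in the sample, $n$ = no. of students = 7.

$X_i$      $X_i - 107$     $(X_i - 107)^2$
100        -7              49
99         -8              64
110        3               9
105        -2              4
112        5               25
107        0               0
116        9               81
Total 749  0               232

$$\bar{X} = \frac{1}{7}\sum_{i=1}^{7} X_i = \frac{749}{7} = 107$$

$$s = \sqrt{\frac{\sum_{i=1}^{7}(X_i - 107)^2}{7-1}} = \sqrt{\frac{232}{6}} \approx 6.22$$

Computational Formula of the Variance (page 41)

If the mean is a rounded figure, then rounding errors propagate very quickly when we use the definitional formula to compute the variance. We can avoid this by using the following computational formula for the variance:

Population Variance:

$$\sigma^2 = \frac{N\sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2}{N^2}$$

Sample Variance:

$$s^2 = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}$$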

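Proof

$$\sum_{i=1}^{n}(X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2 = \sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n} = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n}$$

since $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Dividing both sides by $(n-1)$ gives

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}$$

Example (page 42)

Using the same data set on the IQ of 7 students in the sample, we compute the sample standard deviation using the computational formula.

$X_i$      $X_i^2$
100        10000
99         9801
110        12100
105        11025
112        12544
107        11449
116        13456
Total 749  80375

$$s = \sqrt{\frac{7\sum_{i=1}^{7} X_i^2 - \left(\sum_{i=1}^{7} X_i\right)^2}{7(7-1)}} = \sqrt{\frac{7(80375) - (749)^2}{7(6)}} \approx 6.22$$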
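As a check on the worked example, the sketch below computes the sample standard deviation of the seven IQ scores with both the definitional and the computational formula. It uses only the standard library; the variable names are illustrative.

```python
from math import isclose, sqrt

iq = [100, 99, 110, 105, 112, 107, 116]
n = len(iq)

# Definitional formula: squared deviations from the sample mean, divisor n - 1.
xbar = sum(iq) / n
s_definitional = sqrt(sum((x - xbar) ** 2 for x in iq) / (n - 1))

# Computational formula: needs only the sum and the sum of squares.
sum_x = sum(iq)                    # 749
sum_x2 = sum(x * x for x in iq)    # 80375
s_computational = sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(round(s_definitional, 2), round(s_computational, 2))  # 6.22 6.22
assert isclose(s_definitional, s_computational)
```

Because the computational formula never subtracts a rounded mean from each observation, it avoids the rounding-error propagation mentioned above when the mean is not a whole number.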

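Mathematical Properties of the Standard Deviation (page 47)

Property 1: If each observation of a set of data is transformed by the addition (or subtraction) of a constant $c$ to each observation, the standard deviation of the new set of data is the same as the standard deviation of the original data set.

Proof:

Original Data = $\{X_1, X_2, \ldots, X_n\}$
Transformed Data = $\{Y_1, Y_2, \ldots, Y_n\}$ where $Y_i = X_i + c$, so that $\bar{Y} = \bar{X} + c$

$$s_Y = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}\left[(X_i + c) - (\bar{X} + c)\right]^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}} = s_X$$

Mathematical Properties of the Standard Deviation (page 48)

Property 2: If each observation of a set of data is transformed by the multiplication (or division) of a constant $c$ to each observation, the standard deviation of the new set of data is equal to the standard deviation of the original data multiplied (or divided) by $c$ (by $|c|$ if $c$ is negative, since a standard deviation is never negative).

Proof:

Original Data = $\{X_1, X_2, \ldots, X_n\}$
Transformed Data = $\{Y_1, Y_2, \ldots, Y_n\}$ where $Y_i = cX_i$, so that $\bar{Y} = c\bar{X}$

$$s_Y = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}(cX_i - c\bar{X})^2}{n-1}} = \sqrt{\frac{c^2\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}} = |c|\,s_X$$

which equals $c\,s_X$ when $c$ is positive.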
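Both properties can be verified numerically. The sketch below reuses the seven IQ scores from the earlier example and an arbitrary constant `c = 10` (chosen only for illustration), and relies on `statistics.stdev`, which computes the sample standard deviation with divisor $(n-1)$.

```python
from math import isclose
from statistics import stdev

x = [100, 99, 110, 105, 112, 107, 116]
c = 10  # arbitrary constant for the illustration

shifted = [xi + c for xi in x]    # Property 1: add a constant
scaled = [c * xi for xi in x]     # Property 2: multiply by a constant
negated = [-c * xi for xi in x]   # Property 2 with a negative constant

print(stdev(x), stdev(shifted))       # the same spread
print(stdev(scaled), c * stdev(x))    # spread multiplied by c
print(stdev(negated), c * stdev(x))   # spread multiplied by |-c| = c

assert isclose(stdev(shifted), stdev(x))
assert isclose(stdev(scaled), c * stdev(x))
assert isclose(stdev(negated), abs(-c) * stdev(x))
```

The last check shows why the absolute value matters: multiplying by a negative constant flips the data but cannot produce a negative standard deviation.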

Example

The weights (in milligrams) of ants in a sample are as follows:

Sample data = {3, 8, 35, 1, 50}

Its mean is 19.4 mg and its standard deviation is 21.89 mg.

- If each measurement is converted to grams (1000 mg = 1 g), what will be the new mean? The new standard deviation?
- If each ant gained 2 mg in weight, what will be the new mean? The new standard deviation?

Characteristics of the Standard Deviation (page 47)

- It uses every observation in its computation.
- It may be distorted by outliers. This is because squaring large deviations from the mean gives more weight to these outliers.
- It is amenable to algebraic treatment.
- It is always nonnegative. A value of 0 implies the absence of variation.
- The level of measurement must be at least interval for the standard deviation to be interpretable.

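Approximating the Variance from the FDT (pages 43-44)

Population Variance:

$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N} = \frac{N\sum_{i=1}^{k} f_i X_i^2 - \left(\sum_{i=1}^{k} f_i X_i\right)^2}{N^2}$$

Sample Variance:

$$s^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n-1} = \frac{n\sum_{i=1}^{k} f_i X_i^2 - \left(\sum_{i=1}^{k} f_i X_i\right)^2}{n(n-1)}$$

where
$X_i$ = class mark of the ith class interval
$f_i$ = frequency of the ith class interval
$k$ = number of classes
$N$ = number of observations in the population
$n$ = number of observations in the sample
$\mu$ = population mean
$\bar{X}$ = sample mean

Example

Given the frequency distribution of daily income in pesos received by a sample of 85 women, approximate the mean and standard deviation.

Class Limits   $f_i$   $X_i$    $f_i X_i$    $f_i X_i^2$
200-249        4       224.5    898.0        201,601.00
250-299        10      274.5    2,745.0      753,502.50
300-349        14      324.5    4,543.0      1,474,203.50
350-399        25      374.5    9,362.5      3,506,256.25
400-449        13      424.5    5,518.5      2,342,603.25
450-499        10      474.5    4,745.0      2,251,502.50
500-549        9       524.5    4,720.5      2,475,902.25
TOTALS         85               32,532.5     13,005,571.25

$$\bar{X} = \frac{\sum_{i=1}^{7} f_i X_i}{\sum_{i=1}^{7} f_i} = \frac{32{,}532.5}{85} = 382.735$$

$$s = \sqrt{\frac{n\sum_{i=1}^{7} f_i X_i^2 - \left(\sum_{i=1}^{7} f_i X_i\right)^2}{n(n-1)}} = \sqrt{\frac{(85)(13{,}005{,}571.25) - (32{,}532.5)^2}{(85)(84)}} \approx 81.23 \text{ pesos}$$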
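The grouped-data computation above can be reproduced with a few lines of code. This is a minimal sketch: class marks are taken as the midpoints of the class limits, and the last class is entered as 500-549 so that its class mark is 524.5, matching the class mark used in the table.

```python
from math import sqrt

# (lower class limit, upper class limit, frequency) for each class interval.
fdt = [(200, 249, 4), (250, 299, 10), (300, 349, 14), (350, 399, 25),
       (400, 449, 13), (450, 499, 10), (500, 549, 9)]

n = sum(f for _, _, f in fdt)

# Class mark X_i = midpoint of the class limits.
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in fdt)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in fdt)

approx_mean = sum_fx / n
approx_sd = sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(n, sum_fx, sum_fx2)                       # 85 32532.5 13005571.25
print(round(approx_mean, 3), round(approx_sd, 2))  # 382.735 81.23
```

These are approximations because every observation in a class is treated as if it were equal to the class mark.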

Bienaymé-Chebyshev Rule (page 49)

The percentage of observed values that fall within distances of $k$ standard deviations below and above the mean must be at least

$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$

whatever the shape of the data distribution. The value $k$ is any number greater than 1.

The Bienaymé-Chebyshev rule gives us an idea of how to characterize the data set using the standard deviation. We now state this rule for $k$ = 2, 3, and 4 as follows:

- At least $\left(1 - \frac{1}{2^2}\right) \times 100\% = \left(1 - \frac{1}{4}\right) \times 100\% = \frac{3}{4} \times 100\% = 75\%$ of the observed values fall within distances of 2 standard deviations below and above the mean.
- At least $\left(1 - \frac{1}{3^2}\right) \times 100\% = 88.89\%$ of the observed values fall within distances of 3 standard deviations below and above the mean.
- At least $\left(1 - \frac{1}{4^2}\right) \times 100\% = 93.75\%$ of the observed values fall within distances of 4 standard deviations below and above the mean.

Example 8.15 (page 49)

Suppose you have information that the mean weight of all female employees in a manufacturing industry is 120 pounds and the standard deviation of the weights is 8 pounds. Use the Bienaymé-Chebyshev rule to determine the interval containing at least 75% of all the measures in the population.

Solution: We solve for $k$ from the equation $0.75 = 1 - \frac{1}{k^2}$. We find that $k = 2$, so that the interval we are looking for is

$$\mu \pm k\sigma = 120 \pm (2)(8) = 120 \pm 16$$

That is, at least 75% of all female employees have weights ranging from 104 to 136 lbs.

Question: If the standard deviation had been smaller, say 4 lbs, which interval will contain at least 75% of all the measures in the population?
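The rule and the interval in the example can be sketched in code. The function names `chebyshev_min_fraction` and `chebyshev_interval` are hypothetical helpers written for this illustration.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def chebyshev_interval(mean, sd, k):
    """Interval from k standard deviations below to k above the mean."""
    return mean - k * sd, mean + k * sd


for k in (2, 3, 4):
    print(k, f"{100 * chebyshev_min_fraction(k):.2f}%")  # 75.00%, 88.89%, 93.75%

# Weights of female employees: mean 120 lbs, standard deviation 8 lbs, k = 2.
print(chebyshev_interval(120, 8, 2))  # (104, 136)
```

Calling `chebyshev_interval` with a smaller standard deviation and the same `k` yields a narrower interval around the same mean, which is the point of the follow-up question.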

Comparing the Variation of Observations of 2 or More Distributions

Consider the following sample of weights of ants in milligrams = {3, 8, 35, 1, 50}. Its standard deviation is 21.89 mg.

This time consider this sample of weights of elephants in grams = {6000000, 5999999, 5999998, 6000001, 6000002}. Its standard deviation is 1.581 grams.

Can we use the standard deviations of the two collections to compare the variation of the observations of these two collections?

We cannot use measures of absolute dispersion to compare the variation of the observations of two or more collections when (i) the units are different or (ii) the means are very different from each other.

Definition of Measures of Relative Dispersion

Measures of relative dispersion are measures of dispersion that have no unit of measurement and are used to compare the scatter of one distribution with the scatter of another distribution.

Definition of Coefficient of Variation (page 53)

The coefficient of variation (CV) is a measure of relative dispersion and is defined as:

$$\text{population } CV = \frac{\sigma}{\mu} \times 100\%$$

where
$\sigma$ is the population standard deviation
$\mu$ is the population mean

$$\text{sample } CV = \frac{s}{\bar{X}} \times 100\%$$

where
$s$ is the sample standard deviation
$\bar{X}$ is the sample mean

Note: The coefficient of variation describes the standard deviation as a percentage of the mean. (Example: CV = 10% indicates that the standard deviation is 10% of the mean.) Consequently, the CV is not interpretable when the mean is negative and is undefined when the mean is 0.

Example 8.17b (page 54)

Suppose we get the prices of an 80-gram pack of a certain brand of cracker nuts at 20 different grocery stores. The mean price of the 20 packs of cracker nuts is P9.50 with a standard deviation of P0.6. On the other hand, the weights of the 20 packs of cracker nuts have a mean content of 82 grams with a standard deviation of 3.5 grams. Can we say that prices are more variable than weight?

Solution:

$$CV_{price} = \frac{0.6}{9.5} \times 100\% = 6.32\%$$

$$CV_{weight} = \frac{3.5}{82} \times 100\% = 4.27\%$$

Since the CV of the prices is larger than the CV of the weights, the prices are relatively more variable than the weights.
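A small sketch of the coefficient of variation follows, applied to the ant and elephant samples and to the cracker nuts example. The helper `cv_percent` is a hypothetical name; it rejects non-positive means because, as noted above, the CV is undefined for a mean of 0 and not interpretable for a negative mean.

```python
from statistics import mean, stdev


def cv_percent(sd, mean_value):
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    if mean_value <= 0:
        raise ValueError("CV is only interpreted here for a positive mean")
    return 100 * sd / mean_value


ants_mg = [3, 8, 35, 1, 50]
elephants_g = [6000000, 5999999, 5999998, 6000001, 6000002]

cv_ants = cv_percent(stdev(ants_mg), mean(ants_mg))
cv_elephants = cv_percent(stdev(elephants_g), mean(elephants_g))
print(f"ants: {cv_ants:.2f}%   elephants: {cv_elephants:.6f}%")

# Cracker nuts example: summary statistics taken from the text.
cv_price = cv_percent(0.6, 9.5)
cv_weight = cv_percent(3.5, 82)
print(round(cv_price, 2), round(cv_weight, 2))  # 6.32 4.27
```

Because the CV has no unit, the ant and elephant samples can be compared directly even though one is measured in milligrams and the other in grams.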

Example 8.17c (page 55)

The foreign exchange rate is an indicator of the stability of the peso and an indicator of economic performance. The level of the peso is dependent on market forces and not on government policy. The government intervenes only through the Bangko Sentral ng Pilipinas when there are speculative elements in the market. Given below are the means and standard deviations of the quarterly P-$ exchange rate for the periods 1998 to 1999 and 2000 to 2001. In which of the two periods is the peso more stable?

                     1998-1999    2000-2001
Mean                 P40.4        P48.6
Standard Deviation   P2.01        P1.21

Solution:

$$CV_{98\text{-}99} = \frac{2.01}{40.4} \times 100\% = 4.98\%$$

$$CV_{00\text{-}01} = \frac{1.21}{48.6} \times 100\% = 2.49\%$$

The peso was more stable in 2000-2001, the period with the smaller CV.

Another important measure: The Standard Score (page 50)

- The standard score or z-score indicates the relative position of an observation in the collection where the observed value came from.
- It is used to compare two values from different collections that (1) differ with respect to $\bar{X}$ or $s$, or both, or (2) are expressed in different units.
- It is also used to identify possible outliers. As a rule, if |standard score| > 3, then the observation is marked as a possible outlier.
- When all the observations in a collection are standardized, the mean and standard deviation of this collection of standard scores are 0 and 1, respectively.

Definition of Standard Score (page 50)

Definition 8.4. The standard score or z-score measures how many standard deviations an observed value is above or below the mean.

$$\text{Population Z-score} = \frac{X - \mu}{\sigma}$$

where
$\mu$ is the population mean
$\sigma$ is the population standard deviation

$$\text{Sample Z-score} = \frac{X - \bar{X}}{s}$$

where
$\bar{X}$ is the sample mean
$s$ is the sample standard deviation

Remarks (page 50)

- A positive z-score measures the number of standard deviations an observation is above the mean, and a negative z-score measures the number of standard deviations an observation is below the mean.
- A z-score of 0 means that the observation is equal to the mean.
- The z-score has no unit, which makes it possible to compare z-scores computed using different collections.
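The sample z-score definition above can be sketched as follows, reusing the seven IQ scores from the earlier example. The sketch also checks two facts stated in the text: standardized observations have mean 0 and standard deviation 1, and an observation is flagged as a possible outlier when its absolute z-score exceeds 3. The function name `z_scores` is illustrative.

```python
from math import isclose
from statistics import mean, stdev


def z_scores(values):
    """Sample z-score of every observation: (x - sample mean) / sample sd."""
    xbar = mean(values)
    s = stdev(values)
    return [(x - xbar) / s for x in values]


iq = [100, 99, 110, 105, 112, 107, 116]
z = z_scores(iq)

for x, zi in zip(iq, z):
    print(x, round(zi, 2))

# Standardized scores have mean 0 and standard deviation 1 (up to rounding).
assert isclose(mean(z), 0, abs_tol=1e-9)
assert isclose(stdev(z), 1)

possible_outliers = [x for x, zi in zip(iq, z) if abs(zi) > 3]
print(possible_outliers)  # [] : no score is more than 3 sd from the mean
```

Since z-scores are unitless, the same function can be applied to two different collections and the resulting scores compared directly.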

Example 8.16 (pages 50-51)

The mean grade in Statistics 101 is 70% and the standard deviation is 10%; whereas in Math 17, the mean grade is 80% and the standard deviation is 20%.

a) Mark got a grade of 75% in Statistics 101 and a grade of 90% in Math 17. In which subject did Mark perform better if we consider the grades of the other students in the two subjects?

Solution:

$$z_{Stat\,101} = \frac{75 - 70}{10} = 0.5, \qquad z_{Math\,17} = \frac{90 - 80}{20} = 0.5$$

If we consider the grades of the other students in the two subjects, Mark's score in Stat 101 is just as good as his score in Math 17. Based on the z-scores, Mark's scores in both subjects are 0.5 standard deviations above their respective mean scores.

b) Peter got a grade of 70% in both Statistics 101 and Math 17. In which subject did Peter perform better if we consider the grades of the other students in the two subjects?

Solution:

$$z_{Stat\,101} = \frac{70 - 70}{10} = 0, \qquad z_{Math\,17} = \frac{70 - 80}{20} = -0.5$$

Peter performed relatively better in Statistics 101. Based on the z-scores, Peter's score in Stat 101 is equal to the mean score in Stat 101, while his score in Math 17 is 0.5 standard deviations below the mean.

c) Paul got a grade of 100% in Stat 101. Compute the z-score and interpret.

Solution:

$$z = \frac{100 - 70}{10} = 3$$

Paul's score is above the mean. Its distance from the mean is thrice the size of the standard deviation. We may consider this an unusually high score when compared with other Stat 101 grades.

Assignment

1. Using the data in Exercise no. 2, page 51, compute the following:
   a) Mean
   b) Range
   c) Standard deviation using the standard deviation mode of your calculator
   d) Standard deviation using the computational formula (show solution)
   e) Coefficient of variation
   f) Use the Bienaymé-Chebyshev rule to identify an interval that will enclose at least 75% of the observations.
   g) What percentage of observations are actually contained within 2 standard deviations from the mean?
2. Using the FDT in Exercise no. 4, page 52, approximate the following:
   a) Mean
   b) Range
   c) Standard deviation using the computational formula (show all important steps)
   d) Coefficient of variation
3. Answer Exercise no. 6, page 52.
4. Answer Exercise no. 1, page 55. Justify your answer with the appropriate statistics.