Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day

Similar documents
ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics

MEASURES OF DISPERSION (VARIABILITY)

Chapter 2 Descriptive Statistics

CHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements.

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

Median and IQR The median is the value which divides the ordered data values in half.

6.3 Testing Series With Positive Terms

Economics 250 Assignment 1 Suggested Answers. 1. We have the following data set on the lengths (in minutes) of a sample of long-distance phone calls

ANALYSIS OF EXPERIMENTAL ERRORS

Anna Janicka Mathematical Statistics 2018/2019 Lecture 1, Parts 1 & 2

Analysis of Experimental Measurements

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

CURRICULUM INSPIRATIONS: INNOVATIVE CURRICULUM ONLINE EXPERIENCES: TANTON TIDBITS:

Elementary Statistics

Statistics 511 Additional Materials

Random Variables, Sampling and Estimation

Measures of Spread: Standard Deviation

multiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2.

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

11 Correlation and Regression

BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Zeros of Polynomials

4.3 Growth Rates of Solutions to Recurrences

1 Inferential Methods for Correlation and Regression Analysis

Economics Spring 2015

Data Description. Measure of Central Tendency. Data Description. Chapter x i

Stat 139 Homework 7 Solutions, Fall 2015

Example: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}.

Measures of Variation

Topic 1 2: Sequences and Series. A sequence is an ordered list of numbers, e.g. 1, 2, 4, 8, 16, or

Linear Regression Demystified

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

a. For each block, draw a free body diagram. Identify the source of each force in each free body diagram.

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

First, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So,

Chapter 23: Inferences About Means

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

Lecture 24 Floods and flood frequency

Activity 3: Length Measurements with the Four-Sided Meter Stick

Measures of Spread: Variance and Standard Deviation


Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Infinite Sequences and Series

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations

1 Lesson 6: Measure of Variation

Section 11.8: Power Series

An Introduction to Randomized Algorithms

The Method of Least Squares. To understand least squares fitting of data.

This is an introductory course in Analysis of Variance and Design of Experiments.

Simple Linear Regression

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

Introduction There are two really interesting things to do in statistics.

STP 226 EXAMPLE EXAM #1

SEQUENCES AND SERIES

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

3.2 Properties of Division 3.3 Zeros of Polynomials 3.4 Complex and Rational Zeros of Polynomials

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Curve Sketching Handout #5 Topic Interpretation Rational Functions

TOPIC 6 MEASURES OF VARIATION

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Chapter 10: Power Series

Chapter 12 Correlation

Chapter 4. Fourier Series

APPENDIX F Complex Numbers

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,

Alternating Series. 1 n 0 2 n n THEOREM 9.14 Alternating Series Test Let a n > 0. The alternating series. 1 n a n.

Average-Case Analysis of QuickSort

µ and π p i.e. Point Estimation x And, more generally, the population proportion is approximately equal to a sample proportion

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Parameter, Statistic and Random Samples

Topic 9: Sampling Distributions of Estimators

The Poisson Distribution

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman:

The Random Walk For Dummies

CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics

Chapter 6 Overview: Sequences and Numerical Series. For the purposes of AP, this topic is broken into four basic subtopics:

Calculus with Analytic Geometry 2

x a x a Lecture 2 Series (See Chapter 1 in Boas)

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

Lecture 15: Learning Theory: Concentration Inequalities

Probability, Expectation Value and Uncertainty

Topic 9: Sampling Distributions of Estimators

3 Resampling Methods: The Jackknife

Topic 9: Sampling Distributions of Estimators

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!

Expectation and Variance of a random variable

The axial dispersion model for tubular reactors at steady state can be described by the following equations: dc dz R n cn = 0 (1) (2) 1 d 2 c.

Error & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i :

MATH/STAT 352: Lecture 15

NAME: ALGEBRA 350 BLOCK 7. Simplifying Radicals Packet PART 1: ROOTS

Variance of Discrete Random Variables Class 5, Jeremy Orloff and Jonathan Bloom

Properties and Hypothesis Testing

Dotting The Dot Map, Revisited. A. Jon Kimerling Dept. of Geosciences Oregon State University

MA131 - Analysis 1. Workbook 2 Sequences I

U8L1: Sec Equations of Lines in R 2

SOLUTIONS TO EXAM 3. Solution: Note that this defines two convergent geometric series with respective radii r 1 = 2/5 < 1 and r 2 = 1/5 < 1.

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

Transcription:

LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the we will go o to the case of a frequecy distributio. The first thig to ote is that, whereas the rage as well as the quartile deviatio are two such measures of dispersio which are NOT based o all the values, the mea deviatio ad the stadard deviatio are two such measures of dispersio that ivolve each ad every data-value i their computatio. You must have oted that the rage was measurig the dispersio of the data-set aroud the mid-rage, whereas the quartile deviatio was measurig the dispersio of the data-set aroud the media. How are we to decide upo the amout of dispersio roud the arithmetic mea? It would seem reasoable to compute the DISTANCE of each observed value i the series from the arithmetic mea of the series. Let us do this for a simple data-set show below: The Number of Fatalities i Motorway Accidets i oe Week: Day Number of fatalities Suday 4 Moday 6 Tuesday Wedesday 0 Thursday 3 Friday 5 Saturday 8 Total 8 Let us do this for a simple data-set show below: The Number of Fatalities i Motorway Accidets i oe Week: Day Number of fatalities Suday 4 Moday 6 Tuesday Wedesday 0 Thursday 3 Friday 5 Saturday 8 Total 8

The arithmetic mea umber of fatalities per day is 8 4 7 I order to determie the distaces of the data-values from the mea, we subtract our value of the arithmetic mea from each daily figure, ad this gives us the deviatios that occur i the third colum of the table below: Day Number of fatalities Suday 4 0 Moday 6 + Tuesday Wedesday 0 4 Thursday 3 1 Friday 5 + 1 Saturday 8 + 4 TOTAL 8 0 The deviatios are egative whe the daily figure is less tha the mea (4 accidets) ad positive whe the figure is higher tha the mea. It does seem, however, that our efforts for computig the dispersio of this data set have bee i vai, for we fid that the total amout of dispersio obtaied by summig the (x x) colum comes out to be zero! I fact, this should be o surprise, for it is a basic property of the arithmetic mea that:the sum of the deviatios of the values from the mea is zero. The questio arises: How will we measure the dispersio that is actually preset i our data-set? Our problem might at first sight seem irresolvable, for by this criterio it appears that o series has ay dispersio. Yet we kow that this is absolutely icorrect, ad we must thik of some other way of hadlig this situatio. Surely, we might look at the umerical differece betwee the mea ad the daily fatality figures without cosiderig whether these are positive or egative. Let us deote these absolute differeces by modulus of d or mod d. This is evidet from the third colum of the table below: d d 4 0 0 6 0 4 4 3 1 1 5 1 1 8 4 4 Total 14

By igorig the sig of the deviatios we have achieved a o-zero sum i our secod colum. Averagig these absolute differeces, we obtai a measure of dispersio kow as the mea deviatio. I other words, the mea deviatio is give by the formula: MEAN DEVIATION: M.D. d i As we are averagig the absolute deviatios of the observatios from their mea, therefore the complete ame of this measure is mea absolute deviatio --- but geerally we just say mea deviatio. Applyig this formula i our example, we fid that: The mea deviatio of the umber of fatalities is 14 M.D.. 7 The formula that we have just cosidered is valid i the case of raw data. I case of grouped data i.e. a frequecy distributio, the formula becomes MEAN DEVIATION FOR GROUPED DATA: M.D. f i x i x fi d i As far as the graphical represetatio of the mea deviatio is cocered, it ca be depicted by a horizotal lie segmet draw below the -axis o the graph of the frequecy distributio, as show below:

f Mea Deviatio The approach which we have adopted i the cocept of the mea deviatio is both quick ad simple. But the problem is that we itroduce a kid of artificiality i its calculatio by igorig the algebraic sigs of the deviatios. I problems ivolvig descriptios ad comparisos aloe, the mea deviatio ca validly be applied; but because the egative sigs have bee discarded, further theoretical developmet or applicatio of the cocept is impossible. Mea deviatio is a absolute measure of dispersio. Its relative measure, kow as the co-efficiet of mea deviatio, is obtaied by dividig the mea deviatio by the average used i the calculatio of deviatios i.e. the arithmetic mea. Thus Co-efficiet of M.D: Sometimes, the mea deviatio is computed by averagig the absolute deviatios of the datavalues from the media i.e. M. D. Mea x x~ Mea deviatio Ad whe will we have a situatio whe we will be usig the media istead of the mea?as discussed earlier, the media will be more appropriate tha the mea i those cases where our data-set cotais a few very high or very low values.i such a situatio, the coefficiet of mea deviatio is give by: Co-efficiet of M.D: M.D. Media Let us ow cosider the stadard deviatio --- that statistic which is the most importat ad the most widely used measure of dispersio. The poit that made earlier that from the mathematical poit of view, it is ot very preferable to take the absolute values of the deviatios, This problem is overcome by computig the stadard deviatio. I order to compute the stadard deviatio, rather tha takig the absolute values of the deviatios, we square the deviatios. Averagig these squared deviatios, we obtai a statistic that is kow as the variace.

Variace ( x x) Let us compute this quatity for the data of the above example. Our -values were: 4 6 0 3 5 8 Takig the deviatios of the -values from their mea, ad the squarig these deviatios, we obtai: ( x x ) ( x x ) 4 0 0 6 + 4 4 0 4 16 3 1 1 5 + 1 1 8 + 4 16 4 Obviously, both ( ) ad () equal 4, both ( 4) ad (4) equal 16, ad both ( 1) ad (1) 1.

Hece (x x) 4 is ow positive, ad this positive value has bee achieved without bedig the rules of mathematics. Averagig these squared deviatios, the variace is give by: Variace: ( x x) 4 6 7 The variace is frequetly employed i statistical work, but it should be oted that the figure achieved is i squared uits of measuremet. I the example that we have just cosidered, the variace has come out to be 6 squared fatalities, which does ot seem to make much sese! I order to obtai a aswer which is i the origial uit of measuremet, we take the positive square root of the variace. The result is kow as the stadard deviatio. STANDARD DEVIATION: S ( x x ) Hece, i this example, our stadard deviatio has come out to be.45 fatalities. I computig the stadard deviatio (or variace) it ca be tedious to first ascertai the arithmetic mea of a series, the subtract it from each value of the variable i the series, ad fially to square each deviatio ad the sum. It is very much more straight-forward to use the short cut formula give below: SHORT CUT FORMULA FOR THE STANDARD DEVIATION: S x x I order to apply the short cut formula, we require oly the aggregate of the series ( x) ad the aggregate of the squares of the idividual values i the series ( x). I other words, oly two colums of figures are called for. The umber of idividual calculatios is also cosiderably reduced, as see below:

4 16 6 36 4 0 0 3 9 5 5 8 64 Total 8 154 Therefore S 154 8 7 7 ( 16) 6.45 fatalities The formulae that we have just discussed are valid i case of raw data. I case of grouped data i.e. a frequecy distributio, each squared deviatio roud the mea must be multiplied by the appropriate frequecy figure i.e. STANDARD DEVIATION IN CASE OF GROUPED DATA: S f ( x x ) Ad the short cut formula i case of a frequecy distributio is: SHORT CUT FORMULA OF THE STANDARD DEVIATION IN CASE OF GROUPED DATA: fx fx S Which is agai preferred from the computatioal stadpoit? For example, the stadard deviatio life of a batch of electric light bulbs would be calculated as follows: EAMPLE: Life (i Hudreds of Hours) No. of Bulbs f Midpoit x fx fx 0 5 4.5 10.0 5.0 5 10 9 7.5 67.5 506.5 10 0 38 15.0 570.0 8550.0 0 40 33 30.0 990.0 9700.0 40 ad over 16 50.0 800.0 40000.0

Therefore, stadard deviatio: S 78781.5 437.5 100 100 13.9hudredhours 1390 hours As far as the graphical represetatio of the stadard deviatio is cocered, a horizotal lie segmet is draw below the -axis o the graph of the frequecy distributio --- just as i the case of the mea deviatio. f Stadard deviatio The stadard deviatio is a absolute measure of dispersio. Its relative measure called coefficiet of stadard deviatio is defied as: Coefficiet of S.D: Sta dard Deviatio Mea

Ad, multiplyig this quatity by 100, we obtai a very importat ad well-kow measure called the coefficiet of variatio. Coefficiet of Variatio: S C.V. 100 As metioed earlier, the stadard deviatio is expressed i absolute terms ad is give i the same uit of measuremet as the variable itself. There are occasios, however, whe this absolute measure of dispersio is iadequate ad a relative form becomes preferable. For example, if a compariso betwee the variability of distributios with differet variables is required, or whe we eed to compare the dispersio of distributios with the same variable but with very differet arithmetic meas. To illustrate the usefuless of the coefficiet of variatio, let us cosider the followig two examples: EAMPLE-1 Suppose that, i a particular year, the mea weekly earigs of skilled factory workers i oe particular coutry were $ 19.50 with a stadard deviatio of $ 4, while for its eighborig coutry the figures were Rs. 75 ad Rs. 8 respectively. From these figures, it is ot immediately apparet which coutry has the GREATER VARIABILITY i earigs. The coefficiet of variatio quickly provides the aswer: COEFFICIENT OF VARIATION For coutry No. 1: 4 100 0.5 per cet, 19.5 Ad for coutry No. : 8 100 37.3 per cet. 75 From these calculatios, it is immediately obvious that the spread of earigs i coutry No. is greater tha that i coutry No. 1, ad the reasos for this could the be sought. EAMPLE-: The crop yield from 0 acre plots of wheat-lad cultivated by ordiary methods averages 35 bushels with a stadard deviatio of 10 bushels. The yield from similar lad treated with a ew fertilizer averages 58 bushels, also with a stadard deviatio of 10 bushels. At first glace, the yield variability may seem to be the same, but i fact it has improved (i.e. decreased) i view of the higher average to which it relates. Agai, the coefficiet of variatio shows this very clearly: Coefficiet of Variatio:

Utreated lad: 10 100 8.57 per cet 35 Treated lad: 10 100 17.4 per cet 58 The coefficiet of variatio for the utreated lad has come out to be 8.57 percet, whereas the coefficiet of variatio for the treated lad is oly 17.4 percet.