TOPIC 6 MEASURES OF VARIATION


Data Analysis Course, Secretariat of the Pacific Community

"If people's eyes tend to blank out tables of figures, you can be darn sure that they blank out the small writing that goes around them."
Alan Graham, 1994

The concept of variation

sometimes averages aren't enough
A measure of the average value can provide a lot of useful information about a set of observations, but in many cases it is not sufficient to tell us everything about the variable. Consider, for example, Figure 6.1 below:

Figure 6.1: Comparison of Two Distributions (Population A and Population B)

the distributions are different yet give the same averages
While the two distributions shown have the same average values, whether measured as a mean, a median, or a mode, we could not say that the distributions were the same. To describe and compare them we need additional information; we need alternative ways of describing the distributions. After the average value, the next most important property of the distribution that we need to measure is the variability of the distribution. From Figure 6.1 we can see that distribution B is much more variable (or spread out) than distribution A. In this section we shall look at different ways of measuring variability.

actual level of variability
We want to measure variability for two main reasons. Firstly, we may be interested in the actual level of variability and in comparing this with another distribution. If we are looking at income distributions, for example, then the government may be interested not only in the average income level, but also in the variability of income level between people and also between different regions of a country. Many policies are designed to help redistribute income from the richest to the poorest (thereby reducing the variability of income levels), and so we would need to measure variability to see if it changes over time.

variability due to sample variation
The second reason for wanting to measure variability is when we use sampling to compare population values. We then need to take variability into account. We want to be able to distinguish between differences that might have just happened by chance (that is, in the selection of the samples) and those that indicate some real change.

example
Let us look at an example where we are comparing two population distributions.

Figure 6.2: Comparison of Income Levels of Two Populations (Population A and Population B; vertical axis: relative frequency)

Income variability is not necessarily reflected in averages
Population A represents the distribution of annual income per household in one region and Population B represents the distribution of annual income of households in another region. Both have the same mean level of income of $1,800 per year, but we cannot say that the two distributions are the same. The distribution of income of Population B is far more spread out than Population A. It also has, therefore, a greater degree of variability.

two different measures of variability
It is clear that we should not only compare measures of location when looking at populations, but also measures of variability. In this topic we shall consider two different measures of variability which are basically of two types:
a. measures of the distance between representative values of the population; and
b. measures of the distance of every unit of the population from some specified central value.

range and standard deviation for ungrouped data
As examples of these measures of variability we shall look in this topic at the range and the standard deviation (or variance) for ungrouped data. More complicated techniques (such as finding the standard deviation when the observations are grouped in a frequency distribution) are covered in advanced training.

The range

largest - smallest
The simplest way to measure the variation or spread, given a set of observations, is to calculate the range. The range of a set of observations is defined to be the difference between the smallest and the largest values in the set. This is very simple to understand and easy to calculate and so has an obvious appeal. It is used in practice, but is only really useful when the variable under consideration has a fairly even spread of values over the range. It has some obvious drawbacks which tend to restrict its use in practice; some of the more important disadvantages are:

disadvantages
a. because the range is the difference between the largest and the smallest value, it is very sensitive to very large or very small observations. The inclusion of just one freak (that is, rare or unusual) value will greatly affect the range;
b. the range is dependent on the number of observations. Increasing the number of observations can only increase the range; it can never make it less. This means that it is difficult to compare ranges for two distributions with different numbers of observations;
c. while the range is very easy to calculate, it has the disadvantage that it ignores all the data in between the highest and the lowest values. If, for example, we consider the following three sets of data:
   Set 1: 3, 5, 7, 9, 11, 13, 15, 17, 17, 17, 17
   Set 2: 3, 5, 5, 5, 17, 17, 17, 17, 17, 17, 17
   Set 3: 3, 6, 7, 8, 10, 11, 14, 14, 15, 16, 17
   we see that the range for all three sets is the same (17 - 3 = 14), but the degree of variation is by no means the same;
d. it is difficult to calculate the range for data grouped in a frequency distribution. All we can really do is take the difference between the lower limit of the first class and the upper limit of the last class. This would obviously depend on the definitions of the classes, and is impossible if you have an open-ended class. However, some judgments can be made depending on the knowledge of the subject matter under observation. For practical purposes the open-ended classes are usually closed by guessing a value for the open end.
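The three sets under disadvantage (c) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `value_range` is made up for this example, and the standard deviation (introduced later in this topic) is used here simply as a second yardstick to show that the spreads really do differ.

```python
from statistics import pstdev

# The three sets listed under disadvantage (c).
data_sets = {
    "Set 1": [3, 5, 7, 9, 11, 13, 15, 17, 17, 17, 17],
    "Set 2": [3, 5, 5, 5, 17, 17, 17, 17, 17, 17, 17],
    "Set 3": [3, 6, 7, 8, 10, 11, 14, 14, 15, 16, 17],
}


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


ranges = {name: value_range(values) for name, values in data_sets.items()}
# pstdev divides by the number of values, i.e. it treats each set as a population.
spreads = {name: pstdev(values) for name, values in data_sets.items()}

for name in data_sets:
    print(f"{name}: range = {ranges[name]}, standard deviation = {spreads[name]:.2f}")
```

All three sets report the same range, while the second measure separates them, which is the point made in (c).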
example
Let us consider another example, the values of imports in various Pacific island countries shown in Table 6.1.

Table 6.1: Total imports by country, 1995 (in thousand AUD)

Country              Value of imports (A$'000)
Cook Islands                            65,363
Fiji                                   1,17,05
Kiribati                                47,547
Marshall Islands                       100,073
Papua New Guinea                     1,741,935
Samoa                                   16,689
Solomon Islands                           4,54
Tonga                                   98,047
Tuvalu                                  12,535
Vanuatu                                  14,51

Source: SPESS 14, 1998, Pacific Community, Noumea

method
The range of the import values is the difference between the largest and the smallest value and in this case the range is:

Range = $(1,741,935,000 - 12,535,000) = $1,729 million

do not usually calculate range for grouped data
The range from a grouped frequency distribution is not usually calculated because of the reasons given in the section on disadvantages of the range. However, it can be obtained approximately by taking the difference between the upper limit of the last class and the lower limit of the first class. We must note that it can sometimes be very difficult and at times meaningless if either or both of these classes are open-ended. Let us consider once again the example of annual household cash income in two regions of a country, which are given in the following frequency distributions:

Table 6.2: Comparison of the range of income of two regions (annual household cash income in $ by class, with frequency = number of households, for Region A and Region B). In both regions the first class ("Less than ...") and the last class ("... & over") are open-ended; their assumed limits are marked with *.
Source: Table 5.1 (illustrative data only)
* = Assumed limits

open-ended class intervals
Obviously we cannot calculate the range of income in such cases because of the presence of open-ended class intervals at both ends. However, if we do have to calculate income ranges for the two populations, we will be forced to make some assumptions. These assumptions may be well-founded or ill-founded, but nevertheless, if a decision has to be made, we will have to put some values in the open-ended classes. In the example above, the assumed values are:

Assumed Values ($)            Region A   Region B
Lower limit (first class)          200        500
Upper limit (last class)        30,000     20,000

range
The income ranges for the distributions may now be calculated as follows:

Region A: Income Range = $(30,000 - 200) = $29,800
Region B: Income Range = $(20,000 - 500) = $19,500

clearly state assumptions
In the above example, although we have derived the income ranges as $29,800 for region A and $19,500 for region B, the ranges could be meaningless if it was later realised that the assumed values were incorrect.
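The dependence of a grouped range on the assumed limits can be made explicit in a short Python sketch. The helper `grouped_range` and the alternative limit of 50,000 are hypothetical, introduced only for illustration; the limits for Regions A and B are the assumed values stated above.

```python
def grouped_range(lower_limit_first_class, upper_limit_last_class):
    """Approximate range of a grouped distribution from its two outer class limits."""
    if upper_limit_last_class < lower_limit_first_class:
        raise ValueError("upper limit must not be below the lower limit")
    return upper_limit_last_class - lower_limit_first_class


# Assumed limits ($) used to close the open-ended classes, as stated in the text.
assumed_limits = {
    "Region A": {"lower": 200, "upper": 30_000},
    "Region B": {"lower": 500, "upper": 20_000},
}

income_ranges = {
    region: grouped_range(limits["lower"], limits["upper"])
    for region, limits in assumed_limits.items()
}
for region, income_range in income_ranges.items():
    print(f"{region}: income range = ${income_range:,}")

# Hypothetical alternative: closing Region A's top class at 50,000 instead of 30,000.
# The result moves one-for-one with the assumption, so the assumption must be stated.
alternative_region_a = grouped_range(200, 50_000)
print(f"Region A with a different assumption: ${alternative_region_a:,}")
```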
However, statisticians and planners are often confronted with such problems in their everyday work, and decisions such as those taken in the case of the ranges of income in regions A and B are the types of decisions which they have to live with. The important thing is that the assumptions applied to generate a result are clearly stated.

use points other than the highest and lowest
We can get around most of the problems of the range as a measure of the variation by using other points in the distribution rather than the two extreme points. One choice would be to measure what we call the quartile deviation or the semi inter-quartile range (that is, half the difference between the upper and lower quartiles). For a discussion of upper and lower quartiles refer to Topic 5, More on measures of location. The quartile deviation is not included in these notes, but is covered in the advanced analysis course.

use percentiles
Another alternative is to use the difference between, say, the 10th and the 90th percentile (that is, those values for which 10 per cent and 90 per cent of the observed values are below). As measures of variation, both of these are quite useful. They are not affected by any one or two extreme or rare observations, they are less dependent on the number of observations, and they will tend to differentiate between different sets of observations. In the case of ungrouped frequency distributions, we can nearly always calculate these values. In the case of grouped frequency distributions, a problem occurs when one of the percentile or quartile values falls in an open-ended class.

Standard deviation

standard deviation as a measure of spread
Although the range is a simple measure of variation or spread, it has many disadvantages. We therefore need a measure which will overcome these disadvantages while still providing a good measure of variation. One method is the mean deviation where we measure the distance of observations from the mean. However, the mean deviation incorporates absolute values and these are difficult to deal with mathematically.
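As a rough illustration of the alternatives just mentioned, the sketch below computes the 10th to 90th percentile difference, the quartile deviation and the mean deviation for Set 3 from the discussion of the range. The function names are invented for this example, and the percentile rule used (Python's `statistics.quantiles` with `method="inclusive"`) is one of several conventions; other software may give slightly different cut points.

```python
from statistics import mean, quantiles


def percentile_range(values, low=10, high=90):
    """Difference between two percentiles (10th and 90th by default)."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[high - 1] - cuts[low - 1]


def quartile_deviation(values):
    """Semi inter-quartile range: half the distance between the quartiles."""
    lower_q, _, upper_q = quantiles(values, n=4, method="inclusive")
    return (upper_q - lower_q) / 2


def mean_deviation(values):
    """Average absolute distance of the observations from their mean."""
    centre = mean(values)
    return sum(abs(x - centre) for x in values) / len(values)


set_3 = [3, 6, 7, 8, 10, 11, 14, 14, 15, 16, 17]
with_freak_value = set_3 + [170]  # 170 is a made-up extreme observation

for label, data in [("Set 3", set_3), ("Set 3 plus freak value", with_freak_value)]:
    print(label)
    print("  range              :", max(data) - min(data))
    print("  10th-90th pct range:", round(percentile_range(data), 2))
    print("  quartile deviation :", round(quartile_deviation(data), 2))
    print("  mean deviation     :", round(mean_deviation(data), 2))
```

Adding the single extreme value stretches the range enormously, while the percentile-based measures move only a little.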
The standard deviation is based on the same principles as the mean deviation, but in this case we eliminate the signs of the deviations from the mean by squaring them.

method
How does the standard deviation work? Like the mean, the standard deviation takes all the observed values into account. If there were no dispersion at all in a distribution, all the observed values would be the same. The mean would also be the same as this repeated value. So if everyone had the same height of 180cm, the mean would be 180cm. No observed value would deviate or differ from the mean. But, with dispersion, the observed values do deviate from the mean, some by a lot, some by only a little. Quoting the standard deviation of a distribution is a way of indicating a kind of average amount by which all the values deviate from the mean. The greater the dispersion, the bigger the deviations and the bigger the standard deviation.

principle of standard deviation
The standard deviation is found by adding the squares of the deviations of the individual values from the mean of the distribution, dividing this sum by the number of items in the distribution, and then finding the square root of this number. Let's now explain the procedure for calculating the standard deviation in more detail. In terms of a population consisting of N values \(x_1, x_2, x_3, \ldots, x_N\) with a mean \(\mu\) (pronounced mu or mew), the standard deviation of a population is defined as:

Formula
\[ \text{Standard Deviation } (\sigma) = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]

Definition
To describe the formula we will work through the steps to calculate the standard deviation.

First we calculate \(\mu\): \(\mu\) is calculated the same way as \(\bar{x}\) in the previous chapter (i.e. we add up all the numbers and divide by how many numbers there were). We call it \(\mu\) when we are dealing with a population, rather than \(\bar{x}\) when it is a sample.
We subtract \(\mu\) from each x value: \( (x_i - \mu) \)
Square each of these values: \( (x_i - \mu)^2 \)
Sum these values to get the total: \( \sum_{i=1}^{N} (x_i - \mu)^2 \)
Divide by the number of units in the population (N): \( \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \)
Take the square root of everything: \( \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \)

The standard deviation of a population is denoted by \(\sigma\) (the Greek letter for small sigma).

variance
The square of the standard deviation is called the variance and is denoted by \(\sigma^2\). When we square the result of a formula which has a square root, the square root sign is cancelled out and disappears. We then have:

formula
\[ \text{Variance } (\sigma^2) = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]

sample variance
If we are dealing with a sample and wish to calculate the sample variance (or sample standard deviation) in order to estimate the value for the population, the formula is changed slightly. In this case \(s^2\) stands for the sample variance, \(\bar{x}\) the sample mean, and n the sample size. The formula for the sample variance is then:

\[ \text{Sample Variance } (s^2) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

and the sample standard deviation is given by:

\[ \text{Sample Standard Deviation } (s) = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

sample = (n - 1)
These formulae for samples are effectively the same as those for populations, except that we have used the divisor (n - 1) instead of N. The important thing to remember is that when calculating the variance or standard deviation of a sample, divide by (n - 1). When calculating the variance or standard deviation of a population, divide by N.

large standard deviation = large spread
Note that the more the values of individual items differ from the mean, the greater will be the square of these differences and therefore the greater the sum of squares. Therefore, the greater the sum of squares, the larger s (the standard deviation) will be. Hence, the greater the dispersion, the larger the standard deviation will be.

example
We will now go through the calculation of the standard deviation using the following data.

Table 6.3: 2000 Secondary School Enrolment by Province, PNG

Province     Enrolments   Deviation from mean   Deviations squared
Western             961                -2,470            6,100,900
Gulf              1,523                -1,908            3,640,464
NCD               4,854                 1,423            2,024,929
Central           3,344                   -87                7,569
Oro               3,134                  -297               88,209
SHP               1,682                -1,749            3,059,001
EHP               5,768                 2,337            5,461,569
Simbu             6,182                 2,751            7,568,001
             Mean = 3,431                       Total = 27,950,642

Source: Illustrative data only

first find the mean
To calculate the standard deviation we first calculate \(\bar{x}\).

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{961 + 1{,}523 + 4{,}854 + 3{,}344 + 3{,}134 + 1{,}682 + 5{,}768 + 6{,}182}{8} = 3{,}431 \]

In Column 3 we subtract the mean value from the value for each province. In Column 4 we square the deviations and sum these squared deviations, giving a total of 27,950,642.
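The columns of Table 6.3 can be rebuilt step by step in Python, which is a convenient way to check the arithmetic. This is a minimal sketch: the variable names are chosen for this example only, and the last two lines simply apply the population (divide by N) and sample (divide by n - 1) formulas given above to the same sum of squares.

```python
from math import sqrt

# Column 2 of Table 6.3: enrolments by province.
enrolments = {
    "Western": 961, "Gulf": 1523, "NCD": 4854, "Central": 3344,
    "Oro": 3134, "SHP": 1682, "EHP": 5768, "Simbu": 6182,
}

n = len(enrolments)
mean_enrolment = sum(enrolments.values()) / n

# Column 3: deviation of each value from the mean.
deviations = {province: x - mean_enrolment for province, x in enrolments.items()}
# Column 4: squared deviations, then their total.
squared_deviations = {province: d ** 2 for province, d in deviations.items()}
sum_of_squares = sum(squared_deviations.values())

print(f"{'Province':<10}{'Enrolments':>12}{'Deviation':>12}{'Squared':>14}")
for province, x in enrolments.items():
    print(f"{province:<10}{x:>12,}{deviations[province]:>12,.0f}"
          f"{squared_deviations[province]:>14,.0f}")
print(f"Mean = {mean_enrolment:,.0f}, sum of squared deviations = {sum_of_squares:,.0f}")

population_sd = sqrt(sum_of_squares / n)    # data treated as a whole population
sample_sd = sqrt(sum_of_squares / (n - 1))  # data treated as a sample
print(f"Population SD = {population_sd:,.2f}, sample SD = {sample_sd:,.2f}")
```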

data from a population
If the above data are considered to be from a population, then to derive the standard deviation we divide the sum of the squared deviations by the number of observations (N = 8) and take the square root. In this case we have:

\[ \text{Population Standard Deviation } (\sigma) = \sqrt{\frac{27{,}950{,}642}{8}} = \sqrt{3{,}493{,}830.25} = 1{,}869.18 \]

data from a sample
However, if the data are considered to be a sample from a population, then to derive the standard deviation we divide the sum of the squared deviations by one less than the number of observations (n - 1 = 7) and take the square root. In this case we have:

\[ \text{Sample Standard Deviation } (s) = \sqrt{\frac{27{,}950{,}642}{7}} = \sqrt{3{,}992{,}948.86} = 1{,}998.24 \]

In this example we would probably consider the data to be sample data, so would divide by 7.

awkward with a large set of numbers
Although this is a fairly simple procedure to calculate the standard deviation of a small set of numbers, it is quite a cumbersome procedure for a large set of numbers. First of all we have to determine the mean of the set, then calculate the deviations of each observation from the mean, square these and add them up. Even with the aid of a calculator the operations take quite a lot of time. It is best to use a computer to perform the calculations.

rearrange the formula
We can, however, make the calculation much easier by rearranging the formula for the variance. Thus, for a sample, we have:

Sample formula
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} \]

population formula
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N} \]

steps for sample variance
Let's run through this formula for the sample variance.

For the sample variance we first square each individual x value: \( x_i^2 \)
We then calculate the sum of those squared numbers: \( \sum_{i=1}^{n} x_i^2 \). Call this total A.
We also calculate the total of the individual x values: \( \sum_{i=1}^{n} x_i \)
We square this total: \( \left(\sum_{i=1}^{n} x_i\right)^2 \)
and divide by n (the number in the sample): \( \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \). Call this total B.
We then take A - B and divide by (n - 1): \( \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} \)

not as complicated as it looks
Although the second formula looks more complicated, it is in fact much easier to use when we are using a calculator. For example, let us consider the following sample values, which are the same observations that we considered in Table 6.3.

example
961  1,523  4,854  3,344  3,134  1,682  5,768  6,182

total and the mean of the observations
a. Calculating the variance of the sample the first way would entail firstly obtaining the total and the mean of the observations. We have:

\( \sum x_i = 27{,}448 \), n = 8, \( \bar{x} = 3{,}431 \)

The deviations from the mean are:

-2,470  -1,908  1,423  -87  -297  -1,749  2,337  2,751

The sum of the squares of the deviations is 27,950,642. Thus, the variance is:

\( s^2 = 27{,}950{,}642 / 7 = 3{,}992{,}948.86 \)

second method
b. Calculating the variance using the second method or formula we need:

\( \sum x_i = 27{,}448 \) and \( \sum x_i^2 = 122{,}124{,}730 \)

\[ s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{122{,}124{,}730 - \frac{(27{,}448)^2}{8}}{7} = \frac{122{,}124{,}730 - 94{,}174{,}088}{7} = 3{,}992{,}948.86 \]

second method is easier and faster
Thus we see that if we use the memory function in a calculator, the second calculation can be done without having to write any intermediate results. You will also note that the variance derived using either of the two methods is the same (3,992,948.86), except that the second method is easier and faster.
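The two routes to the sample variance can be compared directly in Python. The sketch below follows the "total A" and "total B" steps described above; the names `total_a` and `total_b` mirror the text, and `statistics.variance` (which also divides by n - 1) is used only as an independent cross-check.

```python
from math import isclose
from statistics import variance

values = [961, 1523, 4854, 3344, 3134, 1682, 5768, 6182]
n = len(values)

# First method: deviations from the mean, squared and summed.
x_bar = sum(values) / n
first_method = sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Second method: work only from the sum of x and the sum of x squared.
total_a = sum(x * x for x in values)  # total A: sum of the squared values
total_b = sum(values) ** 2 / n        # total B: squared total divided by n
second_method = (total_a - total_b) / (n - 1)

print(f"A = {total_a:,}, B = {total_b:,.0f}")
print(f"first method : {first_method:,.2f}")
print(f"second method: {second_method:,.2f}")
print(f"library check: {variance(values):,.2f}")
```

One caution that is not part of the notes: on a computer, with very large values and a small spread, subtracting two nearly equal totals can lose precision, so the first method (or a library routine) is often the safer choice in software even though the second is quicker on a calculator.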

Properties of the standard deviation

remember
When using the standard deviation it is important to remember the following points:
o the standard deviation is used only to measure the spread about the mean;
o the standard deviation is never negative;
o the standard deviation is sensitive to extreme values (called outliers). A single outlier can raise the standard deviation a great deal, distorting the picture of spread; and
o the greater the spread, the greater the standard deviation.

Coefficient of variation

the mean adds meaning to the standard deviation
The standard deviation by itself is not very meaningful unless it is considered along with the arithmetic mean. For example, a standard deviation of $100 when the mean income is $10,000 implies a much greater relative variation than a standard deviation of $100 for a mean GDP figure of $10,000,000. Also, comparing the variability of two populations with different units of measurement (for example, income levels in Papua New Guinea (Kina) and Vanuatu (Vatu)) can be very difficult.

interested in variation from the mean
Hence, the variability in a set of observations can usefully be measured relative to a central measure such as the arithmetic mean. Such a measure is provided by the coefficient of variation, which is the ratio of the standard deviation to the arithmetic mean, usually expressed as a percentage, and is given by the formula:

formula
\[ \text{Coefficient of Variation (C.V.)} = \frac{\sigma}{\bar{x}} \times 100 \]
(The 100 converts the number to a percentage.)

can compare data
To compare the variability of two sets of figures would therefore involve comparing their respective coefficients of variation. The coefficient of variation allows for comparisons when:
o the means of the distributions being compared are far apart, or
o the data are in different units.

percentage
The units are converted to a common denominator (a percent).

example
If we look at the data in Table 6.3, we can calculate the coefficient of variation as:

C.V. = \( (\sigma / \bar{x}) \times 100 \) = 1,869.18 / 3,431 × 100 = 54.48%

illustrative example
Let's use some made-up data to illustrate the coefficient of variation. The mean income of homeowners in Australia is $40,000 with a standard deviation of $4,000. In Kiribati, the mean income of homeowners is $12,000 with a standard deviation of $1,200. (Note that the means are far apart and the standard deviations are different.) Compare and interpret the relative dispersion in the two groups of incomes.

solution
The first impulse is to say that there is more dispersion in the incomes in Australia because the standard deviation is greater. However, when we convert the two measurements to relative terms using the coefficient of variation, we find that the relative dispersion is the same.

CV(Australia) = \( (\sigma / \bar{x}) \times 100 \) = $4,000/$40,000 × 100 = 10%
CV(Kiribati) = \( (\sigma / \bar{x}) \times 100 \) = $1,200/$12,000 × 100 = 10%

similar CV
In summary, the incomes for both Australia and Kiribati have a similar amount of relative variation.

example
We could also compare two different types of data: incomes and age of homeowners. We could compare the spread of incomes of homeowners in Australia with, say, the spread of the age of homeowners. The mean age of homeowners is 40 years with a standard deviation of 10 years.

CV(age) = \( (\sigma / \bar{x}) \times 100 \) = (10/40) × 100 = 25%
CV(income) = 10%

We can see that there is greater relative dispersion in the ages of the homeowners than in their incomes.

Normal distribution

used extensively
A particular distribution that is used extensively in statistical theory is the normal distribution.

properties
The normal distribution has several key properties:
o it is symmetrical;
o it is bell shaped;
o the mean of the distribution is the peak; and
o the area under the curve is always 1.

always have the four
Normal distributions can have different means and standard deviations, but they always have these four key properties.

everyday examples
Many phenomena in everyday life can be described by the normal curve, for example people's height. A small number of people in the population are very short, a small number are very tall, and the majority of the population fall in some middle range. Many other phenomena are also normally distributed, for example test scores and weights of people. We could discuss the normal distribution extensively, but for now that is all you need to know.

Reference Ranges for a Standard Deviation

analysis of data normally distributed
When analysing normally distributed data, the standard deviation is used with the mean to calculate where the data lie within certain reference ranges. The most important thing to understand about reference ranges is that for any set of normally distributed data:

reference ranges
o about 68% of the data lie in the interval \( \bar{x} - s < x < \bar{x} + s \) (that is, 68% of the data lie in the range from the mean minus the standard deviation to the mean plus the standard deviation);
o about 95% of the data lie in the interval \( \bar{x} - 2s < x < \bar{x} + 2s \);
o about 99% of the data lie in the interval \( \bar{x} - 3s < x < \bar{x} + 3s \)
where \(\bar{x}\) = the mean; and s = the standard deviation.

68% reference range
If we look at the data in Table 6.3, we can calculate the 68% reference range for the data as:

68% Reference range: \( (\bar{x} - s, \bar{x} + s) \)
= (3,431 - 1,998.24, 3,431 + 1,998.24)
= (1,432.76, 5,429.24)

That is, 68% of the data lies in the range 1,432.76 to 5,429.24.

95% reference range
We can calculate the 95% reference range as:

95% Reference range: \( (\bar{x} - 2s, \bar{x} + 2s) \)
= (3,431 - 2(1,998.24), 3,431 + 2(1,998.24))
= (3,431 - 3,996.48, 3,431 + 3,996.48)
= (-565.48, 7,427.48)

That is, 95% of the data lies in the range -565.48 to 7,427.48.

Summary of the measures of variability

RANGE
o is easily calculated, except for frequency distributions, and is well understood;
o is based on the two extreme observations and is thus very unstable;
o is difficult to manipulate mathematically;
o provides no information about the general behaviour of the distribution;
o should only be used as a rough guide to the level of variability.

VARIANCE/STANDARD DEVIATION
o is a measure of variability using information from every observation;
o with some manipulation, the calculations are reasonably straightforward;
o has a central role in mathematical and statistical theory and is very widely used;
o can be affected by extreme values;
o is the most commonly used measure of variability.

COEFFICIENT OF VARIATION
o is independent of the units of observations. Therefore, it is useful in comparing distributions where the units of observations are different;
o a disadvantage of the coefficient of variation is that it is unstable when the arithmetic mean is close to zero.
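To pull the summary together, the following Python sketch recomputes the coefficient of variation and the reference ranges for the Table 6.3 data, and repeats the Australia, Kiribati and age comparisons. The function names are made up for this example. Following the worked figures in the notes, the C.V. for Table 6.3 uses the population standard deviation and the reference ranges use the sample standard deviation; small differences in the last decimal place can arise because the notes round s before doubling it.

```python
from statistics import mean, pstdev, stdev


def coefficient_of_variation(standard_deviation, arithmetic_mean):
    """Standard deviation as a percentage of the mean."""
    if arithmetic_mean == 0:
        # The notes warn that the C.V. is unstable near zero; at zero it is undefined.
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * standard_deviation / arithmetic_mean


def reference_range(arithmetic_mean, standard_deviation, k):
    """Interval from k standard deviations below the mean to k above it."""
    return (arithmetic_mean - k * standard_deviation,
            arithmetic_mean + k * standard_deviation)


enrolments = [961, 1523, 4854, 3344, 3134, 1682, 5768, 6182]  # Table 6.3
x_bar = mean(enrolments)

cv_table_6_3 = coefficient_of_variation(pstdev(enrolments), x_bar)
range_68 = reference_range(x_bar, stdev(enrolments), 1)
range_95 = reference_range(x_bar, stdev(enrolments), 2)

print(f"range = {max(enrolments) - min(enrolments):,}")
print(f"C.V.  = {cv_table_6_3:.2f}%")
print(f"68% reference range: ({range_68[0]:,.2f}, {range_68[1]:,.2f})")
print(f"95% reference range: ({range_95[0]:,.2f}, {range_95[1]:,.2f})")

# The made-up comparisons from the coefficient of variation section.
cv_australia = coefficient_of_variation(4_000, 40_000)
cv_kiribati = coefficient_of_variation(1_200, 12_000)
cv_age = coefficient_of_variation(10, 40)
print(f"CV(Australia) = {cv_australia:.0f}%, CV(Kiribati) = {cv_kiribati:.0f}%, "
      f"CV(age) = {cv_age:.0f}%")
```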

One final characteristic of a distribution

understand the underlying structure
The objective of summarising a set of data is to make it possible to comprehend the underlying structure and pattern of the distribution of the values of the variable under consideration. The attempt in summarising the data is to reduce them to a few measures which would give us an indication of the central values, variation of the values, concentration of the frequencies and shape of the distribution. The frequency distribution describes the population we are considering, and the measures of location and variation help us to characterise the distribution by simple measures.

skewed distributions asymmetrical
Another way of characterising a distribution is to study its skewness (that is, whether the distribution is symmetrical and, if not, whether the observations are concentrated in the low or high values). Examples of skewed distributions are income, land holding size and household size. For such distributions, one is interested in finding out the type of skewness: whether there are more units with low values than units with high values, or whether there are more units with high values than units with low values.

right tail
A distribution is said to be positively skewed if large frequency values are concentrated to the left of the distribution and the distribution has small frequency values to the right of the distribution (that is, the distribution has a right tail and has more low values than high values).

left tail
A distribution is said to be negatively skewed if large frequency values are concentrated to the right of the distribution and the distribution has small frequency values to the left of the distribution (that is, the distribution has a left tail and has more high values than low values).

three main features
A distribution can be considered to have three main features which are of interest in studying a population. These features are:
1 its central values;
2 its variation from the central values;
3 whether the distribution is symmetric about the central values; and if not symmetric, whether it is leaning to the left or right.
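The notes describe skewness in words only and do not give a formula. One common way to put a number on it, assumed here purely for illustration, is the moment coefficient of skewness (the average cubed deviation divided by the 1.5th power of the average squared deviation): it is positive for a right tail, negative for a left tail and zero for a symmetric set. The household-size figures below are invented for the example.

```python
def skewness(values):
    """Moment coefficient of skewness (an assumed measure, not from the notes)."""
    n = len(values)
    centre = sum(values) / n
    second_moment = sum((x - centre) ** 2 for x in values) / n
    third_moment = sum((x - centre) ** 3 for x in values) / n
    if second_moment == 0:
        raise ValueError("all values are equal, so skewness is undefined")
    return third_moment / second_moment ** 1.5


def describe_skew(values, tolerance=1e-9):
    """Translate the sign of the coefficient into the wording used in the notes."""
    coefficient = skewness(values)
    if coefficient > tolerance:
        return "positively skewed (right tail)"
    if coefficient < -tolerance:
        return "negatively skewed (left tail)"
    return "symmetric"


# Hypothetical data: most households are small, one is very large.
household_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]
mirrored = [13 - size for size in household_sizes]  # same shape, tail on the left
symmetric = [1, 2, 3, 4, 5]

for label, data in [("household sizes", household_sizes),
                    ("mirrored", mirrored),
                    ("symmetric", symmetric)]:
    print(f"{label}: {describe_skew(data)}")
```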

Exercises

1. The local bus company employs 10 people. The length of service, in completed years, for each employee is as follows:
(a) Calculate the range.
(b) Calculate the standard deviation (assume the values are sample values).
(c) Calculate the coefficient of variation.
(d) Calculate the reference range which contains approximately 68% of observations.

2. Customs files reveal the ages of persons leaving the country. A sample of ages are: 16, 41, 5, 1, 30, 17, 9, 50, 30 and 39.
(a) Calculate the range.
(b) Assume the values are sample values and calculate the sample variance using the second method of calculating the variance.
(c) Calculate the coefficient of variation.
(d) Calculate the reference range that contains approximately 95% of observations.

3. The local market reported the following number of people buying vegetables for the past 9 days:
(a) Calculate the range.
(b) Calculate the standard deviation (assume the values are sample values).
(c) Calculate the coefficient of variation.
(d) Calculate the reference range that contains approximately 95% of observations.

Self-Review

1. The following data represent the amount spent (in dollars) by a random sample of 14 households on basic food items for one month:
(a) Calculate the range.
(b) Calculate the sample standard deviation.
(c) Calculate the coefficient of variation.
(d) Calculate the reference range that contains approximately 99% of observations.


Excel functions

More statistical functions
In Topic 5, you were shown how to use the functions related to Measures of Location. In this section, those relevant to Measures of Variation are illustrated. You don't have to use the functions; instead you can set up a worksheet with the three columns (observation, deviation from the mean and deviations squared). See the computer notes for Topic 7 to set up the worksheet to calculate the variance, standard deviation and standard error from sample data. You have to be careful because the way your sample was selected determines how the standard error is calculated. If you have any doubts about the correct formula to use, contact the SPC Statistics Programme for help.

When calculating the variance or standard deviation, it might be more useful to use the worksheet method rather than the Excel function. If you have the columns set up in your worksheet you can see the different components of the equation (x etc.), and it would be easier to find out why you had a larger or smaller than expected deviation in your data. You also have to be aware that Excel uses its average function, which includes 0 values in the count of observations (n); this might not be appropriate in all circumstances.

The range
You don't really need to use a function to calculate the range; use the sort buttons on the Standard toolbar. You can sort from smallest to largest with the ascending sort button, and from largest to smallest with the descending sort button. Be careful when you sort data: either select ALL your data, or click with the mouse in the column you want to sort by. It is very easy to corrupt your data with the sort buttons (you don't get a warning like you do with the sort option on the Data menu).

Population variance
Excel calculates the variance for a POPULATION using a formula which is a different way of writing the one used in your notes.
Format: =varp(cell range)
Example: =varp(a1:a333) will calculate the variance for the POPULATION in cells A1 to A333.

Sample variance
Excel calculates the variance for a SAMPLE using a formula which again is a different way of writing the one used in your notes.
Format: =var(cell range)
Example: =var(a1:a333) will calculate the variance for the SAMPLE in cells A1 to A333.

Population standard deviation
Excel calculates the standard deviation for a POPULATION using a formula which is a different way of writing the one used in your notes.
Format: =stdevp(cell range)
Example: =stdevp(a1:a333) will calculate the standard deviation for the POPULATION in cells A1 to A333.

Sample standard deviation
Excel calculates the standard deviation for a SAMPLE using a formula which again is a different way of writing the one used in your notes.
Format: =stdev(cell range)
Example: =stdev(a1:a333) will calculate the standard deviation for the SAMPLE in cells A1 to A333.

Confidence interval
You can use Excel to calculate the confidence interval for a mean. You have to type in the standard deviation, so the function is not that user friendly.
Format: =confidence(alpha,standard_dev,size)
where alpha is the significance level used to compute the confidence level. The confidence level equals 100*(1 - alpha)%, or in other words, an alpha of 0.05 indicates a 95 percent confidence level. Standard_dev is the population standard deviation for the data range and is assumed to be known. Size is the sample size.
Example: Suppose we observe that, in a sample of 50 commuters, the average length of travel to work is 30 minutes with a population standard deviation of 2.5. We can calculate with 95% confidence that the population mean is in the interval:
=CONFIDENCE(0.05,2.5,50) equals approximately 0.69
In other words, the average length of travel to work equals 30 ± 0.69 minutes, or 29.3 to 30.7 minutes.
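For readers working outside Excel, the same quantities can be obtained with Python's standard `statistics` module. This is a sketch of rough equivalents, not a description of how Excel computes its results: `pvariance`, `variance`, `pstdev` and `stdev` correspond to varp, var, stdevp and stdev, while the `confidence` helper below is written for this example and assumes the usual normal-theory half-width, z × standard_dev / √size. The data are the Table 6.3 enrolments.

```python
from math import sqrt
from statistics import NormalDist, pstdev, pvariance, stdev, variance


def confidence(alpha, standard_dev, size):
    """Half-width of a confidence interval for a mean, with known population SD."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if size < 1:
        raise ValueError("size must be at least 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z * standard_dev / sqrt(size)


enrolments = [961, 1523, 4854, 3344, 3134, 1682, 5768, 6182]  # Table 6.3

print(f"population variance (varp): {pvariance(enrolments):,.2f}")
print(f"sample variance     (var) : {variance(enrolments):,.2f}")
print(f"population SD     (stdevp): {pstdev(enrolments):,.2f}")
print(f"sample SD          (stdev): {stdev(enrolments):,.2f}")

# The commuter example: 50 commuters, mean 30 minutes, population SD 2.5.
half_width = confidence(0.05, 2.5, 50)
print(f"95% interval: {30 - half_width:.1f} to {30 + half_width:.1f} minutes")
```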


More information

Measures of Spread: Standard Deviation

Measures of Spread: Standard Deviation Measures of Spread: Stadard Deviatio So far i our study of umerical measures used to describe data sets, we have focused o the mea ad the media. These measures of ceter tell us the most typical value of

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Median and IQR The median is the value which divides the ordered data values in half.

Median and IQR The median is the value which divides the ordered data values in half. STA 666 Fall 2007 Web-based Course Notes 4: Describig Distributios Numerically Numerical summaries for quatitative variables media ad iterquartile rage (IQR) 5-umber summary mea ad stadard deviatio Media

More information

Example: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}.

Example: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}. 1 (*) If a lot of the data is far from the mea, the may of the (x j x) 2 terms will be quite large, so the mea of these terms will be large ad the SD of the data will be large. (*) I particular, outliers

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

CHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements.

CHAPTER 2. Mean This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by number of measurements. CHAPTER 2 umerical Measures Graphical method may ot always be sufficiet for describig data. You ca use the data to calculate a set of umbers that will covey a good metal picture of the frequecy distributio.

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

ANALYSIS OF EXPERIMENTAL ERRORS

ANALYSIS OF EXPERIMENTAL ERRORS ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder

More information

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

More information

Lecture 24 Floods and flood frequency

Lecture 24 Floods and flood frequency Lecture 4 Floods ad flood frequecy Oe of the thigs we wat to kow most about rivers is what s the probability that a flood of size will happe this year? I 100 years? There are two ways to do this empirically,

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

Estimation of a population proportion March 23,

Estimation of a population proportion March 23, 1 Social Studies 201 Notes for March 23, 2005 Estimatio of a populatio proportio Sectio 8.5, p. 521. For the most part, we have dealt with meas ad stadard deviatios this semester. This sectio of the otes

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

AP Statistics Review Ch. 8

AP Statistics Review Ch. 8 AP Statistics Review Ch. 8 Name 1. Each figure below displays the samplig distributio of a statistic used to estimate a parameter. The true value of the populatio parameter is marked o each samplig distributio.

More information

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals 7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses

More information

Data Description. Measure of Central Tendency. Data Description. Chapter x i

Data Description. Measure of Central Tendency. Data Description. Chapter x i Data Descriptio Describe Distributio with Numbers Example: Birth weights (i lb) of 5 babies bor from two groups of wome uder differet care programs. Group : 7, 6, 8, 7, 7 Group : 3, 4, 8, 9, Chapter 3

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

1 Lesson 6: Measure of Variation

1 Lesson 6: Measure of Variation 1 Lesso 6: Measure of Variatio 1.1 The rage As we have see, there are several viable coteders for the best measure of the cetral tedecy of data. The mea, the mode ad the media each have certai advatages

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Lecture 1. Statistics: A science of information. Population: The population is the collection of all subjects we re interested in studying.

Lecture 1. Statistics: A science of information. Population: The population is the collection of all subjects we re interested in studying. Lecture Mai Topics: Defiitios: Statistics, Populatio, Sample, Radom Sample, Statistical Iferece Type of Data Scales of Measuremet Describig Data with Numbers Describig Data Graphically. Defiitios. Example

More information

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all! ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Solutios Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced

More information

Sample Size Determination (Two or More Samples)

Sample Size Determination (Two or More Samples) Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Measures of Spread: Variance and Standard Deviation

Measures of Spread: Variance and Standard Deviation Lesso 1-6 Measures of Spread: Variace ad Stadard Deviatio BIG IDEA Variace ad stadard deviatio deped o the mea of a set of umbers. Calculatig these measures of spread depeds o whether the set is a sample

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For

More information

Computing Confidence Intervals for Sample Data

Computing Confidence Intervals for Sample Data Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6) STAT 350 Hadout 9 Samplig Distributio, Cetral Limit Theorem (6.6) A radom sample is a sequece of radom variables X, X 2,, X that are idepedet ad idetically distributed. o This property is ofte abbreviated

More information

Analysis of Experimental Data

Analysis of Experimental Data Aalysis of Experimetal Data 6544597.0479 ± 0.000005 g Quatitative Ucertaity Accuracy vs. Precisio Whe we make a measuremet i the laboratory, we eed to kow how good it is. We wat our measuremets to be both

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01 ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor

More information

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples

More information

Anna Janicka Mathematical Statistics 2018/2019 Lecture 1, Parts 1 & 2

Anna Janicka Mathematical Statistics 2018/2019 Lecture 1, Parts 1 & 2 Aa Jaicka Mathematical Statistics 18/19 Lecture 1, Parts 1 & 1. Descriptive Statistics By the term descriptive statistics we will mea the tools used for quatitative descriptio of the properties of a sample

More information

BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS

BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS BUSINESS STATISTICS (PART-9) AVERAGE OR MEASURES OF CENTRAL TENDENCY: THE GEOMETRIC AND HARMONIC MEANS. INTRODUCTION We have so far discussed three measures of cetral tedecy, viz. The Arithmetic Mea, Media

More information

Statisticians use the word population to refer the total number of (potential) observations under consideration

Statisticians use the word population to refer the total number of (potential) observations under consideration 6 Samplig Distributios Statisticias use the word populatio to refer the total umber of (potetial) observatios uder cosideratio The populatio is just the set of all possible outcomes i our sample space

More information

There is no straightforward approach for choosing the warmup period l.

There is no straightforward approach for choosing the warmup period l. B. Maddah INDE 504 Discrete-Evet Simulatio Output Aalysis () Statistical Aalysis for Steady-State Parameters I a otermiatig simulatio, the iterest is i estimatig the log ru steady state measures of performace.

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

(6) Fundamental Sampling Distribution and Data Discription

(6) Fundamental Sampling Distribution and Data Discription 34 Stat Lecture Notes (6) Fudametal Samplig Distributio ad Data Discriptio ( Book*: Chapter 8,pg5) Probability& Statistics for Egieers & Scietists By Walpole, Myers, Myers, Ye 8.1 Radom Samplig: Populatio:

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Confidence Intervals for the Population Proportion p

Confidence Intervals for the Population Proportion p Cofidece Itervals for the Populatio Proportio p The cocept of cofidece itervals for the populatio proportio p is the same as the oe for, the samplig distributio of the mea, x. The structure is idetical:

More information

SNAP Centre Workshop. Basic Algebraic Manipulation

SNAP Centre Workshop. Basic Algebraic Manipulation SNAP Cetre Workshop Basic Algebraic Maipulatio 8 Simplifyig Algebraic Expressios Whe a expressio is writte i the most compact maer possible, it is cosidered to be simplified. Not Simplified: x(x + 4x)

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Measures of Variation

Measures of Variation Chapter : Measures of Variatio from Statistical Aalysis i the Behavioral Scieces by James Raymodo Secod Editio 97814669676 01 Copyright Property of Kedall Hut Publishig CHAPTER Measures of Variatio Key

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

HUMBEHV 3HB3 Measures of Central Tendency & Variability Week 2

HUMBEHV 3HB3 Measures of Central Tendency & Variability Week 2 Describig Data Distributios HUMBEHV 3HB3 Measures of Cetral Tedecy & Variability Week 2 Prof. Patrick Beett Ofte we wish to summarize distributios of data, rather tha showig histograms Two basic descriptios

More information

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008 Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece

More information

GG313 GEOLOGICAL DATA ANALYSIS

GG313 GEOLOGICAL DATA ANALYSIS GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data

More information

Chapter 6. Sampling and Estimation

Chapter 6. Sampling and Estimation Samplig ad Estimatio - 34 Chapter 6. Samplig ad Estimatio 6.. Itroductio Frequetly the egieer is uable to completely characterize the etire populatio. She/he must be satisfied with examiig some subset

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test. Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal

More information

Census. Mean. µ = x 1 + x x n n

Census. Mean. µ = x 1 + x x n n MATH 183 Basic Statistics Dr. Neal, WKU Let! be a populatio uder cosideratio ad let X be a specific measuremet that we are aalyzig. For example,! = All U.S. households ad X = Number of childre (uder age

More information

Activity 3: Length Measurements with the Four-Sided Meter Stick

Activity 3: Length Measurements with the Four-Sided Meter Stick Activity 3: Legth Measuremets with the Four-Sided Meter Stick OBJECTIVE: The purpose of this experimet is to study errors ad the propagatio of errors whe experimetal data derived usig a four-sided meter

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Topic 10: Introduction to Estimation

Topic 10: Introduction to Estimation Topic 0: Itroductio to Estimatio Jue, 0 Itroductio I the simplest possible terms, the goal of estimatio theory is to aswer the questio: What is that umber? What is the legth, the reactio rate, the fractio

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Kinetics of Complex Reactions

Kinetics of Complex Reactions Kietics of Complex Reactios by Flick Colema Departmet of Chemistry Wellesley College Wellesley MA 28 wcolema@wellesley.edu Copyright Flick Colema 996. All rights reserved. You are welcome to use this documet

More information

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9 BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous

More information

NCSS Statistical Software. Tolerance Intervals

NCSS Statistical Software. Tolerance Intervals Chapter 585 Itroductio This procedure calculates oe-, ad two-, sided tolerace itervals based o either a distributio-free (oparametric) method or a method based o a ormality assumptio (parametric). A two-sided

More information

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece

More information

Error & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i :

Error & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i : Error Error & Ucertaity The error is the differece betwee a TRUE value,, ad a MEASURED value, i : E = i There is o error-free measuremet. The sigificace of a measuremet caot be judged uless the associate

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Power and Type II Error

Power and Type II Error Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error

More information

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

Confidence Intervals รศ.ดร. อน นต ผลเพ ม Assoc.Prof. Anan Phonphoem, Ph.D. Intelligent Wireless Network Group (IWING Lab)
