Range The range is the simplest of the three measures and is defined now.


Measures of Variation

EXAMPLE A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test. Since different chemical agents are added to each group and only six cans are involved, these two groups constitute two small populations. The results (in months) are shown. Find the mean of each group.

Brand A: 10, 60, 50, 30, 40, 20
Brand B: 35, 45, 30, 35, 40, 25

The mean for each brand is 210/6 = 35 months. Since the means are equal, you might conclude that both brands of paint last equally well. However, when the data sets are examined graphically, a somewhat different conclusion might be drawn. As the figure below shows, even though the means are the same for both brands, the spread, or variation, is quite different. The figure shows that brand B performs more consistently; it is less variable.

[Figure: Examine the Data Sets Graphically. Variation of paint (in months): (a) Brand A, (b) Brand B]

For the spread or variability of a data set, three measures are commonly used: range, variance, and standard deviation. Each measure will be discussed in this section.

Range

The range is the simplest of the three measures. The range is the highest value minus the lowest value. The symbol R is used for the range.

\[ R = \text{highest value} - \text{lowest value} \]

Comparison of Outdoor Paint

EXAMPLE Find the ranges for the paints. Make sure the range is given as a single number.

For brand A, \(R = 60 - 10 = 50\) months. For brand B, \(R = 45 - 25 = 20\) months. The range for brand A shows that 50 months separate the largest data value from the smallest data value. For brand B, 20 months separate the largest data value from the smallest data value, which is less than one-half of brand A's range.

One extremely high or one extremely low data value can affect the range markedly, as shown in the next example.

Employee Salaries

EXAMPLE The salaries for the staff of the XYZ Manufacturing Co. are shown here. Find the range.

Staff                   Salary
Owner                   $100,000
Manager                 40,000
Sales representative    30,000
Workers                 25,000
                        15,000
                        18,000

The range is R = $100,000 - $15,000 = $85,000. Since the owner's salary is included in the data, the range is a large number. To have a more meaningful statistic to measure the variability, statisticians use measures called the variance and standard deviation.

Population Variance and Standard Deviation

Before the variance and standard deviation are defined formally, the computational procedure will be shown, since the definition is derived from the procedure.

Rounding Rule for the Standard Deviation: The rounding rule for the standard deviation is the same as that for the mean. The final answer should be rounded to one more decimal place than that of the original data.

Comparison of Outdoor Paint

EXAMPLE Find the variance and standard deviation for the data set for brand A paint in the paint fading example.

10, 60, 50, 30, 40, 20

Solution

Step 1 Find the mean for the data.

\[ \mu = \frac{\sum X}{N} = \frac{10 + 60 + 50 + 30 + 40 + 20}{6} = \frac{210}{6} = 35 \]

Step 2 Subtract the mean from each data value.

10 - 35 = -25    60 - 35 = +25    50 - 35 = +15
30 - 35 = -5     40 - 35 = +5     20 - 35 = -15

Step 3 Square each result.

\[ (-25)^2 = 625 \quad (+25)^2 = 625 \quad (+15)^2 = 225 \quad (-5)^2 = 25 \quad (+5)^2 = 25 \quad (-15)^2 = 225 \]

Step 4 Find the sum of the squares.

\[ 625 + 625 + 225 + 25 + 25 + 225 = 1750 \]

Step 5 Divide the sum by N to get the variance.

\[ \text{Variance} = 1750 \div 6 = 291.7 \]

Step 6 Take the square root of the variance to get the standard deviation. Hence, the standard deviation equals \(\sqrt{291.7}\), or 17.1.

It is helpful to make a table.

A: Values X    B: X - μ    C: (X - μ)²
10             -25         625
60             +25         625
50             +15         225
30             -5          25
40             +5          25
20             -15         225
Sum            0           1750

Column A contains the raw data X. Column B contains the differences X - μ obtained in step 2. Column C contains the squares of the differences obtained in step 3.

The preceding computational procedure reveals several things. First, the square root of the variance gives the standard deviation; and vice versa, squaring the standard deviation gives the variance. Second, the variance is actually the average of the square of the distance that each value is from the mean. Therefore, if the values are near the mean, the variance will be small. In contrast, if the values are far from the mean, the variance will be large.

You might wonder why the squared distances are used instead of the actual distances. One reason is that the sum of the distances will always be zero. To verify this result for a specific case, add the values in column B of the table above. When each value is squared, the negative signs are eliminated.

Finally, why is it necessary to take the square root? The reason is that since the distances were squared, the units of the resultant numbers are the squares of the units of the original raw data. Finding the square root of the variance puts the standard deviation in the same units as the raw data. When you are finding the square root, always use its positive or principal value, since the variance and standard deviation of a data set can never be negative.

The variance is the average of the squares of the distance each value is from the mean. The symbol for the population variance is \(\sigma^2\) (\(\sigma\) is the Greek lowercase letter sigma). The formula for the population variance is

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]

where
X = individual value
μ = population mean
N = population size

The standard deviation is the square root of the variance. The symbol for the population standard deviation is \(\sigma\). The corresponding formula for the population standard deviation is

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]

Comparison of Outdoor Paint

EXAMPLE Find the variance and standard deviation for brand B from the paint data in the first example. The months were

35, 45, 30, 35, 40, 25
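The six-step procedure can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `population_variance` and `population_std` are chosen here for the example and do not come from the text. It applies the population formula (divide by N) to both paint data sets.

```python
import math

def population_variance(values):
    """Average of the squared distances from the mean (divides by N)."""
    n = len(values)
    mu = sum(values) / n                                # Step 1: mean
    squared = [(x - mu) ** 2 for x in values]           # Steps 2-3: subtract, square
    return sum(squared) / n                             # Steps 4-5: sum, divide by N

def population_std(values):
    """Positive square root of the population variance (Step 6)."""
    return math.sqrt(population_variance(values))

brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

for name, data in (("A", brand_a), ("B", brand_b)):
    # Rounded to one more decimal place than the raw data, per the rounding rule.
    print(name, round(population_variance(data), 1), round(population_std(data), 1))
```

For brand B the sum of squared deviations is 250, so the variance is 250/6 and the standard deviation is its square root.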

Since the standard deviation of brand A is 17.1 and the standard deviation of brand B is 6.5, the data are more variable for brand A. In summary, when the means are equal, the larger the variance or standard deviation is, the more variable the data are.

Sample Variance and Standard Deviation

When computing the variance for a sample, one might expect the following expression to be used:

\[ \frac{\sum (X - \bar{X})^2}{n} \]

where \(\bar{X}\) is the sample mean and n is the sample size. This formula is not usually used, however, since in most cases the purpose of calculating the statistic is to estimate the corresponding parameter. For example, the sample mean \(\bar{X}\) is used to estimate the population mean \(\mu\). The expression above does not give the best estimate of the population variance because when the population is large and the sample is small (usually less than 30), the variance computed by this formula usually underestimates the population variance. Therefore, instead of dividing by n, find the variance of the sample by dividing by n - 1, giving a slightly larger value and an unbiased estimate of the population variance.

The formula for the sample variance, denoted by \(s^2\), is

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]

where
\(\bar{X}\) = sample mean
n = sample size

To find the standard deviation of a sample, you must take the square root of the sample variance, which was found by using the preceding formula.

Formula for the Sample Standard Deviation: The standard deviation of a sample (denoted by s) is

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

where
X = individual value
\(\bar{X}\) = sample mean
n = sample size

Shortcut formulas for computing the variance and standard deviation are presented next. These formulas are mathematically equivalent to the preceding formulas and do not involve using the mean. They save time when repeated subtracting and squaring occur in the original formulas. They are also more accurate when the mean has been rounded.

Shortcut or Computational Formulas for \(s^2\) and s

The shortcut formulas for computing the variance and standard deviation for data obtained from samples are as follows.

Variance:

\[ s^2 = \frac{n\left(\sum X^2\right) - \left(\sum X\right)^2}{n(n - 1)} \]

Standard deviation:

\[ s = \sqrt{\frac{n\left(\sum X^2\right) - \left(\sum X\right)^2}{n(n - 1)}} \]

Note that \(\sum X^2\) is not the same as \((\sum X)^2\). The notation \(\sum X^2\) means to square the values first, then sum; \((\sum X)^2\) means to sum the values first, then square the sum.

European Auto Sales

EXAMPLE Find the sample variance and standard deviation for the amount of European auto sales for a sample of 6 years shown. The data are in millions of dollars.

11.2, 11.9, 12.0, 12.8, 13.4, 14.3

Variance and Standard Deviation for a Frequency Distribution

The procedure for finding the variance and standard deviation for frequency distribution data is similar to that for finding the mean for frequency distribution data, and it uses the midpoints of each class.
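Returning to the European auto sales example, the shortcut formula for raw sample data can be checked against the definitional formula. The sketch below is illustrative only; `sample_variance_shortcut` is a name chosen for this example, and the cross-check uses Python's standard `statistics.variance`, which divides by n - 1.

```python
import math
import statistics

def sample_variance_shortcut(values):
    """Shortcut formula: [n * sum(X^2) - (sum X)^2] / [n * (n - 1)]."""
    n = len(values)
    sum_x = sum(values)                      # sum first ...
    sum_x2 = sum(x * x for x in values)      # ... versus square first, then sum
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]   # millions of dollars

s2 = sample_variance_shortcut(sales)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))

# The definitional formula (divide by n - 1) should give the same variance.
print(math.isclose(s2, statistics.variance(sales)))
```

Both routes give the same sample variance up to floating-point rounding, which is the sense in which the text calls the formulas mathematically equivalent.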

Miles Run per Week

EXAMPLE Find the variance and the standard deviation for the frequency distribution of the data in Example 2-7. The data represent the number of miles that 20 runners ran during one week.

Solution

Step 1 Make a table as shown, and find the midpoint of each class.

A: Class    B: Frequency f    C: Midpoint \(X_m\)    D: \(f \cdot X_m\)    E: \(f \cdot X_m^2\)

Be sure to use the number found in the sum of column B (i.e., the sum of the frequencies) for n. Do not use the number of classes.

The steps for finding the variance and standard deviation for grouped data are summarized in this Procedure Table.

Procedure Table: Finding the Sample Variance and Standard Deviation for Grouped Data

Step 1 Make a table as shown, and find the midpoint of each class.

A: Class    B: Frequency f    C: Midpoint \(X_m\)    D: \(f \cdot X_m\)    E: \(f \cdot X_m^2\)

Step 2 Multiply the frequency by the midpoint for each class, and place the products in column D.

Step 3 Multiply the frequency by the square of the midpoint, and place the products in column E.

Step 4 Find the sums of columns B, D, and E. (The sum of column B is n. The sum of column D is \(\sum f \cdot X_m\). The sum of column E is \(\sum f \cdot X_m^2\).)

Step 5 Substitute in the formula and solve to get the variance.

\[ s^2 = \frac{n\left(\sum f \cdot X_m^2\right) - \left(\sum f \cdot X_m\right)^2}{n(n - 1)} \]

Step 6 Take the square root to get the standard deviation.

The three measures of variation are summarized below.

Summary of Measures of Variation

Measure               Definition                                                                 Symbol(s)
Range                 Distance between highest value and lowest value                            R
Variance              Average of the squares of the distance that each value is from the mean    \(\sigma^2\), \(s^2\)
Standard deviation    Square root of the variance                                                \(\sigma\), s
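The grouped-data procedure can be sketched as follows. The midpoints and frequencies below are hypothetical values invented purely for illustration (they are not the miles-run data), and `grouped_sample_variance` is a name chosen for this sketch.

```python
import math
import statistics

def grouped_sample_variance(midpoints, freqs):
    """Steps 2-5 of the procedure table for grouped sample data."""
    n = sum(freqs)                                               # sum of column B
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # sum of column D
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))   # sum of column E
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

# Hypothetical class midpoints and frequencies (illustration only).
midpoints = [10, 20, 30]
freqs = [2, 5, 3]

variance = grouped_sample_variance(midpoints, freqs)
std_dev = math.sqrt(variance)                                    # Step 6
print(round(variance, 2), round(std_dev, 2))

# Cross-check: treat every observation as sitting at its class midpoint.
expanded = [x for x, f in zip(midpoints, freqs) for _ in range(f)]
print(math.isclose(variance, statistics.variance(expanded)))
```

Note that n is the sum of the frequencies (10 here), not the number of classes (3), which is the mistake the text warns against.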

Uses of the Variance and Standard Deviation

1. As previously stated, variances and standard deviations can be used to determine the spread of the data. If the variance or standard deviation is large, the data are more dispersed. This information is useful in comparing two (or more) data sets to determine which is more (most) variable.
2. The measures of variance and standard deviation are used to determine the consistency of a variable. For example, in the manufacture of fittings, such as nuts and bolts, the variation in the diameters must be small, or the parts will not fit together.
3. The variance and standard deviation are used to determine the number of data values that fall within a specified interval in a distribution. For example, Chebyshev's theorem (explained later) shows that, for any distribution, at least 75% of the data values will fall within 2 standard deviations of the mean.
4. Finally, the variance and standard deviation are used quite often in inferential statistics. These uses will be shown in later chapters of this textbook.

Coefficient of Variation

Whenever two samples have the same units of measure, the variance and standard deviation for each can be compared directly. For example, suppose an automobile dealer wanted to compare the standard deviation of miles driven for the cars she received as trade-ins on new cars. She found that for a specific year, the standard deviation for Buicks was 422 miles and the standard deviation for Cadillacs was 350 miles. She could say that the variation in mileage was greater in the Buicks. But what if a manager wanted to compare the standard deviations of two different variables, such as the number of sales per salesperson over a 3-month period and the commissions made by these salespeople?

A statistic that allows you to compare standard deviations when the units are different, as in this example, is called the coefficient of variation.

The coefficient of variation, denoted by CVar, is the standard deviation divided by the mean. The result is expressed as a percentage.

For samples,

\[ \text{CVar} = \frac{s}{\bar{X}} \cdot 100\% \]

For populations,

\[ \text{CVar} = \frac{\sigma}{\mu} \cdot 100\% \]

Sales of Automobiles

EXAMPLE The mean of the number of sales of cars over a 3-month period is 87, and the standard deviation is 5. The mean of the commissions is $5225, and the standard deviation is $773. Compare the variations of the two.

Since the coefficient of variation is larger for commissions, the commissions are more variable than the sales.
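The comparison in the automobile sales example can be worked out numerically. This is a small sketch; the function name `coefficient_of_variation` is chosen here for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return std_dev / mean * 100

sales_cvar = coefficient_of_variation(5, 87)            # number of cars sold
commission_cvar = coefficient_of_variation(773, 5225)   # dollars

print(f"sales: {sales_cvar:.1f}%  commissions: {commission_cvar:.1f}%")
```

Because each result is a unit-free percentage, the two can be compared directly even though one variable counts cars and the other is measured in dollars.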

Range Rule of Thumb

The range can be used to approximate the standard deviation. The approximation is called the range rule of thumb.

The Range Rule of Thumb: A rough estimate of the standard deviation is

\[ s \approx \frac{\text{range}}{4} \]

In other words, if the range is divided by 4, an approximate value for the standard deviation is obtained. For example, the standard deviation for the data set 5, 8, 8, 9, 10, 12, and 13 is 2.7, and the range is \(13 - 5 = 8\). The range rule of thumb gives \(s \approx 8/4 = 2\). The range rule of thumb in this case underestimates the standard deviation somewhat; however, it is in the ballpark.

A note of caution should be mentioned here. The range rule of thumb is only an approximation and should be used when the distribution of data values is unimodal and roughly symmetric.

The range rule of thumb can be used to estimate the largest and smallest "usual" data values of a data set. The smallest data value will be approximately 2 standard deviations below the mean, and the largest data value will be approximately 2 standard deviations above the mean of the data set. The mean for the previous data set is 9.3; hence,

\[ \text{minimum usual data value} = \bar{X} - 2s \qquad \text{maximum usual data value} = \bar{X} + 2s \]

Notice that the smallest data value was 5, and the largest data value was 13. Again, these are rough approximations. For many data sets, almost all data values will fall within 2 standard deviations of the mean. Better approximations can be obtained by using Chebyshev's theorem and the empirical rule. These are explained next.

Chebyshev's Theorem

As stated previously, the variance and standard deviation of a variable can be used to determine the spread, or dispersion, of a variable. That is, the larger the variance or standard deviation, the more the data values are dispersed. For example, if two variables measured in the same units have the same mean, say, 70, and the first variable has a standard deviation of 1.5 while the second variable has a standard deviation of 10, then the data for the second variable will be more spread out than the data for the first variable. Chebyshev's theorem, developed by the Russian mathematician Chebyshev, specifies the proportions of the spread in terms of the standard deviation.

Chebyshev's theorem: The proportion of values from a data set that will fall within k standard deviations of the mean will be at least \(1 - 1/k^2\), where k is a number greater than 1 (k is not necessarily an integer).

This theorem states that at least three-fourths, or 75%, of the data values will fall within 2 standard deviations of the mean of the data set. This result is found by substituting k = 2 in the expression.

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} \quad\text{or}\quad 75\% \]

For the example in which variable 1 has a mean of 70 and a standard deviation of 1.5, at least three-fourths, or 75%, of the data values fall between 67 and 73. These values are found by adding 2 standard deviations to the mean and subtracting 2 standard deviations from the mean, as shown:

\[ 70 + 2(1.5) = 70 + 3 = 73 \quad\text{and}\quad 70 - 2(1.5) = 70 - 3 = 67 \]

For variable 2, at least three-fourths, or 75%, of the data values fall between 50 and 90. Again, these values are found by adding and subtracting, respectively, 2 standard deviations to and from the mean.

\[ 70 + 2(10) = 70 + 20 = 90 \quad\text{and}\quad 70 - 2(10) = 70 - 20 = 50 \]

Furthermore, the theorem states that at least eight-ninths, or 88.89%, of the data values will fall within 3 standard deviations of the mean. This result is found by letting k = 3 and substituting in the expression.

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9} \quad\text{or}\quad 88.89\% \]

For variable 1, at least eight-ninths, or 88.89%, of the data values fall between 65.5 and 74.5, since

\[ 70 + 3(1.5) = 70 + 4.5 = 74.5 \quad\text{and}\quad 70 - 3(1.5) = 70 - 4.5 = 65.5 \]

For variable 2, at least eight-ninths, or 88.89%, of the data values fall between 40 and 100.

[Figure: Chebyshev's Theorem. At least 75% of values lie between \(\bar{X} - 2s\) and \(\bar{X} + 2s\); at least 88.89% lie between \(\bar{X} - 3s\) and \(\bar{X} + 3s\)]

This theorem can be applied to any distribution regardless of its shape. The next two examples illustrate the application of Chebyshev's theorem.

Prices of Homes

EXAMPLE The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000. Find the price range for which at least 75% of the houses will sell.

Chebyshev's theorem can be used to approximate the minimum percentage of data values that will fall between any two given values. The procedure is shown in the next example.

Travel Allowances

EXAMPLE A survey of local companies found that the mean amount of travel allowance for executives was $0.25 per mile. The standard deviation was $0.02. Using Chebyshev's theorem, find the minimum percentage of the data values that will fall between $0.20 and $0.30.
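Both examples can be worked through with a short sketch of Chebyshev's theorem. The function names below are chosen for illustration and are not from the text; the second example assumes, as the numbers suggest, that the two bounds are equally far from the mean.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

def k_sigma_interval(mean, std_dev, k):
    """Interval from mean - k*std_dev to mean + k*std_dev."""
    return mean - k * std_dev, mean + k * std_dev

# Prices of homes: at least 75% corresponds to k = 2.
low, high = k_sigma_interval(50_000, 10_000, 2)
print(low, high, chebyshev_min_proportion(2))

# Travel allowances: find k from the distance between a bound and the mean,
# then substitute it into 1 - 1/k^2.
mean, std_dev = 0.25, 0.02
k = (0.30 - mean) / std_dev
print(f"k = {k:.1f}, at least {chebyshev_min_proportion(k):.0%}")
```

In the first example the interval runs from $30,000 to $70,000. In the second, the bounds $0.20 and $0.30 sit 2.5 standard deviations from the mean, so k need not be an integer.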

The Empirical (Normal) Rule

Chebyshev's theorem applies to any distribution regardless of its shape. However, when a distribution is bell-shaped (or what is called normal), the following statements, which make up the empirical rule, are true.

Approximately 68% of the data values will fall within 1 standard deviation of the mean.
Approximately 95% of the data values will fall within 2 standard deviations of the mean.
Approximately 99.7% of the data values will fall within 3 standard deviations of the mean.

For example, suppose that the scores on a national achievement exam have a mean of 480 and a standard deviation of 90. If these scores are normally distributed, then approximately 68% will fall between 390 and 570 (\(480 + 90 = 570\) and \(480 - 90 = 390\)). Approximately 95% of the scores will fall between 300 and 660 (\(480 + 2 \cdot 90 = 660\) and \(480 - 2 \cdot 90 = 300\)). Approximately 99.7% will fall between 210 and 750 (\(480 + 3 \cdot 90 = 750\) and \(480 - 3 \cdot 90 = 210\)). (The empirical rule is explained in greater detail in Chapter 6.)

[Figure: The Empirical Rule. 68% of values within \(\bar{X} \pm 1s\), 95% within \(\bar{X} \pm 2s\), 99.7% within \(\bar{X} \pm 3s\)]

from Elem. Stats., Bluman

Example The mean of times it takes a commuter to get to work in Baltimore is 29.7 minutes. Assume the distribution of commuter times is approximately bell shaped.

(a) If the standard deviation is 6 minutes, within what limits would you expect 68% of the times to fall?
(b) Within what limits would you expect 95% of the times to fall?
(c) Within what limits would you expect 99.7% of the times to fall?
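The three parts of the commuter example follow directly from the empirical rule. The sketch below assumes the bell-shaped distribution stated in the example; the names `EMPIRICAL_COVERAGE` and `empirical_limits` are chosen here for illustration.

```python
# Approximate coverage (in percent) for 1, 2, and 3 standard deviations.
EMPIRICAL_COVERAGE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_limits(mean, std_dev, k):
    """Limits k standard deviations below and above the mean."""
    return mean - k * std_dev, mean + k * std_dev

mean_time, std_time = 29.7, 6   # minutes

limits = {k: empirical_limits(mean_time, std_time, k) for k in EMPIRICAL_COVERAGE}
for k, (low, high) in limits.items():
    print(f"about {EMPIRICAL_COVERAGE[k]}% of times between "
          f"{low:.1f} and {high:.1f} minutes")
```

Unlike Chebyshev's bound, these percentages are approximations that hold only when the distribution is roughly normal.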

3.4 Measures of Position

A measure of position determines the position of a single value in relation to other values in a sample or a population data set. Measures of position are quartiles, percentiles, and z scores.

Quartiles and Interquartile Range

Quartiles are the summary measures that divide a sorted data set into four equal parts. Three measures will divide any data set into four equal parts. These three measures are the first quartile (denoted by \(Q_1\)), the second quartile (denoted by \(Q_2\)), and the third quartile (denoted by \(Q_3\)). The data should be ranked in increasing order before the quartiles are determined. The quartiles are defined as follows.

Definition (Quartiles): Quartiles are three summary measures that divide a ranked data set into four equal parts. The second quartile is the same as the median of a data set. The first quartile is the value of the middle term among the observations that are less than the median, and the third quartile is the value of the middle term among the observations that are greater than the median.

Figure 3.11 describes the positions of the three quartiles.

[Figure 3.11 Quartiles. \(Q_1\), \(Q_2\), and \(Q_3\) split the data into four portions; each portion contains 25% of the observations of a data set arranged in increasing order]

Approximately 25% of the values in a ranked data set are less than \(Q_1\) and about 75% are greater than \(Q_1\). The second quartile, \(Q_2\), divides a ranked data set into two equal parts; hence, the second quartile and the median are the same. Approximately 75% of the data values are less than \(Q_3\) and about 25% are greater than \(Q_3\).

The difference between the third quartile and the first quartile for a data set is called the interquartile range (IQR).

Calculating Interquartile Range: The difference between the third and the first quartiles gives the interquartile range; that is,

\[ \text{IQR} = \text{Interquartile range} = Q_3 - Q_1 \]

EXAMPLE 3-20 (Finding quartiles and the interquartile range) The table below gives the 2008 profits (rounded to billions of dollars) of 12 companies selected from all over the world.

Company          Profits (billions of dollars)
Merck & Co       8
IBM              12
Unilever         7
Microsoft        17
Petrobras        14
Exxon Mobil      45
Lukoil           10
AT&T             13
Nestlé           17
Vodafone         13
Deutsche Bank    9
China Mobile     11

(a) Find the values of the three quartiles. Where does the 2008 profit of Merck & Co fall in relation to these quartiles?
(b) Find the interquartile range.

Solution

(a) The value of \(Q_2\), which is also the median, is given by the value of the middle term in the ranked data set. For the data of this example, this value is the average of the sixth and seventh terms. Consequently, \(Q_2\) is $12.5 billion. The value of \(Q_1\) is given by the value of the middle term of the six values that fall below the median (or \(Q_2\)). Thus, it is obtained by taking the average of the third and fourth terms. So, \(Q_1\) is $9.5 billion. The value of \(Q_3\) is given by the value of the middle term of the six values that fall above the median. For the data of this example, \(Q_3\) is obtained by taking the average of the ninth and tenth terms, and it is $15.5 billion.

The value of \(Q_1\) = $9.5 billion indicates that 25% of the companies in this sample had 2008 profits less than $9.5 billion and 75% of the companies had 2008 profits higher than $9.5 billion. Similarly, we can state that half of these companies had 2008 profits less than $12.5 billion and the other half had profits greater than $12.5 billion since the second quartile is $12.5 billion. The value of \(Q_3\) = $15.5 billion indicates that 75% of the companies had 2008 profits less than $15.5 billion and 25% had profits greater than this value.

By looking at the position of $8 billion, which is the 2008 profit of Merck & Co, we can state that this value lies in the bottom 25% of the 2008 profits.

(b) The interquartile range is given by the difference between the values of the third and the first quartiles. Thus,

\[ \text{IQR} = \text{Interquartile range} = Q_3 - Q_1 = 15.5 - 9.5 = \$6 \text{ billion} \]

EXAMPLE 3-21 (Finding quartiles and the interquartile range) The following are the ages (in years) of nine employees of an insurance company:

(a) Find the values of the three quartiles. Where does the age of 28 years fall in relation to the ages of these employees?
(b) Find the interquartile range.
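The quartile method used for the 2008 profits in Example 3-20 (median of the lower half, overall median, median of the upper half) can be sketched as below. The function name `quartiles` is chosen for illustration. This hand-rolled version is used because library routines such as `statistics.quantiles` interpolate differently and may not reproduce the textbook values.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 using the median-of-halves definition from the text."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]                                    # observations below the median
    upper = ranked[half:] if n % 2 == 0 else ranked[half + 1:]  # skip the median if n is odd
    return median(lower), median(ranked), median(upper)

# 2008 profits (billions of dollars) from Example 3-20.
profits = [8, 12, 7, 17, 14, 45, 10, 13, 17, 13, 9, 11]

q1, q2, q3 = quartiles(profits)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

For an odd number of observations the median itself is left out of both halves, which matches the wording "observations that are less than the median" and "greater than the median".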

Percentiles and Percentile Rank

Percentiles are the summary measures that divide a ranked data set into 100 equal parts. Each (ranked) data set has 99 percentiles that divide it into 100 equal parts. The data should be ranked in increasing order to compute percentiles. The kth percentile is denoted by \(P_k\), where k is an integer in the range 1 to 99. For instance, the 25th percentile is denoted by \(P_{25}\). Figure 3.12 shows the positions of the 99 percentiles.

[Figure 3.12 Percentiles. \(P_1, P_2, P_3, \ldots, P_{97}, P_{98}, P_{99}\) split the data into 100 portions; each portion contains 1% of the observations of a data set arranged in increasing order]

Thus, the kth percentile, \(P_k\), can be defined as a value in a data set such that about k% of the measurements are smaller than the value of \(P_k\) and about (100 - k)% of the measurements are greater than the value of \(P_k\).

Calculating Percentiles: The (approximate) value of the kth percentile, denoted by \(P_k\), is

\[ P_k = \text{value of the } \left(\frac{kn}{100}\right)\text{th term in a ranked data set} \]

where k denotes the number of the percentile and n represents the sample size.

EXAMPLE 3-22 (Finding the percentile for a data set) Refer to the data on 2008 profits for 12 companies given in Example 3-20. Find the value of the 42nd percentile. Give a brief interpretation of the 42nd percentile.

The data arranged in increasing order are as follows:

7  8  9  10  11  12  13  13  14  17  17  45
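The percentile formula can be applied to Example 3-22 as sketched below. One detail is an assumption of this sketch: when kn/100 is not a whole number, the position is rounded to the nearest term using Python's built-in `round`, which is one reasonable reading of "approximate value". The function name `kth_percentile` is also chosen here.

```python
def kth_percentile(values, k):
    """Approximate kth percentile: the (k*n/100)th term of the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    position = k * n / 100                      # 1-based, possibly fractional
    index = max(1, min(n, round(position)))     # assumption: round to nearest term
    return ranked[index - 1]

# 2008 profits (billions of dollars) from Example 3-20.
profits = [8, 12, 7, 17, 14, 45, 10, 13, 17, 13, 9, 11]

position = 42 * len(profits) / 100
p42 = kth_percentile(profits, 42)
print(position, p42)
```

Here kn/100 = 5.04, so the 42nd percentile is taken as the 5th term of the ranked data: about 42% of the companies had profits below that value.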

We can also calculate the percentile rank for a particular value \(x_i\) of a data set by using the formula given below. The percentile rank of \(x_i\) gives the percentage of values in the data set that are less than \(x_i\).

Finding Percentile Rank of a Value:

\[ \text{Percentile rank of } x_i = \frac{\text{Number of values less than } x_i}{\text{Total number of values in the data set}} \times 100\% \]

EXAMPLE 3-23 (Finding the percentile rank for a data value) Refer to the data on 2008 profits for 12 companies given in Example 3-20. Find the percentile rank for the $14 billion profit of Petrobras. Give a brief interpretation of this percentile rank.

The data arranged in increasing order are as follows:

7  8  9  10  11  12  13  13  14  17  17  45
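The percentile rank formula for Example 3-23 can be sketched in a couple of lines. The function name `percentile_rank` is chosen for illustration.

```python
def percentile_rank(values, x):
    """Percentage of values in the data set that are strictly less than x."""
    below = sum(1 for v in values if v < x)
    return below / len(values) * 100

# 2008 profits (billions of dollars) from Example 3-20.
profits = [8, 12, 7, 17, 14, 45, 10, 13, 17, 13, 9, 11]

below_14 = sum(1 for v in profits if v < 14)
rank_14 = percentile_rank(profits, 14)
print(below_14, f"{rank_14:.2f}%")
```

Eight of the 12 companies had profits below $14 billion, so roughly two-thirds of the companies in the sample earned less than Petrobras.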

Box-and-Whisker Plot

A box-and-whisker plot gives a graphic presentation of data using five measures: the median, the first quartile, the third quartile, and the smallest and the largest values in the data set between the lower and the upper inner fences. (The inner fences are explained in Example 3-24 below.) A box-and-whisker plot can help us visualize the center, the spread, and the skewness of a data set. It also helps detect outliers. We can compare different distributions by making box-and-whisker plots for each of them.

Definition (Box-and-Whisker Plot): A plot that shows the center, spread, and skewness of a data set. It is constructed by drawing a box and two whiskers that use the median, the first quartile, the third quartile, and the smallest and the largest values in the data set between the lower and the upper inner fences.

EXAMPLE 3-24 (Constructing a box-and-whisker plot) The following data are the incomes (in thousands of dollars) for a sample of 12 households. Construct a box-and-whisker plot for these data.

Step 1. First, rank the data in increasing order and calculate the values of the median, the first quartile, the third quartile, and the interquartile range.

Step 2. Find the points that are 1.5 × IQR below \(Q_1\) and 1.5 × IQR above \(Q_3\). These two points are called the lower and the upper inner fences, respectively.

Step 3. Determine the smallest and the largest values in the given data set within the two inner fences. For this example, these two values are 69 and 112.

Step 4. Draw a horizontal line and mark the income levels on it such that all the values in the given data set are covered. Above the horizontal line, draw a box with its left side at the position of the first quartile and the right side at the position of the third quartile. Inside the box, draw a vertical line at the position of the median. The result of this step is shown in Figure 3.13.

[Figure 3.13: box with labels First quartile, Median, Third quartile; horizontal axis: Income]

Step 5. By drawing two lines, join the points of the smallest and the largest values within the two inner fences to the box. These values are 69 and 112 in this example, as listed in Step 3. The two lines that join the box to these two values are called whiskers. A value that falls outside the two inner fences is shown by marking an asterisk and is called an outlier. This completes the box-and-whisker plot, as shown in Figure 3.14.

[Figure 3.14: completed box-and-whisker plot with labels Smallest value within the two inner fences, First quartile, Median, Third quartile, Largest value within the two inner fences, An outlier; horizontal axis: Income]

In Figure 3.14, about 50% of the data values fall within the box, about 25% of the values fall on the left side of the box, and about 25% fall on the right side of the box. Also, 50% of the values fall on the left side of the median and 50% lie on the right side of the median. The data of this example are skewed to the right because the lower 50% of the values are spread over a smaller range than the upper 50% of the values.

The observations that fall outside the two inner fences are called outliers. These outliers can be classified into two kinds: mild and extreme outliers. To do so, we define two outer fences: a lower outer fence at 3.0 × IQR below the first quartile and an upper outer fence at 3.0 × IQR above the third quartile. If an observation is outside either of the two inner fences but within either of the two outer fences, it is called a mild outlier. An observation that is outside either of the two outer fences is called an extreme outlier. For the previous example, the outer fences are at 5 and 173. Because 144 is outside the upper inner fence but inside the upper outer fence, it is a mild outlier.

For a symmetric data set, the line representing the median will be in the middle of the box and the spread of the values will be over almost the same range on both sides of the box.
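The fence rules can be sketched as a small classifier. The quartile values used below, \(Q_1 = 77\) and \(Q_3 = 101\), are not stated in the passage; they are an assumption obtained by solving backward from the outer fences of 5 and 173 given above (\(Q_1 - 3\,\text{IQR} = 5\) and \(Q_3 + 3\,\text{IQR} = 173\)). The function names are chosen for illustration.

```python
def fences(q1, q3):
    """Return (inner, outer) fence pairs built from the quartiles."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return inner, outer

def classify(x, q1, q3):
    """Label a value as an extreme outlier, a mild outlier, or neither."""
    inner, outer = fences(q1, q3)
    if x < outer[0] or x > outer[1]:
        return "extreme outlier"
    if x < inner[0] or x > inner[1]:
        return "mild outlier"
    return "not an outlier"

# Assumed quartiles, back-solved from the stated outer fences (5 and 173).
Q1, Q3 = 77, 101

inner, outer = fences(Q1, Q3)
print(inner, outer)
for income in (69, 112, 144):
    print(income, classify(income, Q1, Q3))
```

With these quartiles the whisker ends 69 and 112 fall inside the inner fences, while 144 lies between the upper inner fence and the upper outer fence, which agrees with the text's description of it as a mild outlier.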

Z Scores

Key Concept: This section introduces measures that can be used to compare values from different data sets, or to compare values within the same data set. The most important concept in this section is the z score, so we should understand the role of z scores (for comparing values from different data sets) and we should develop the ability to convert data values to z scores.

z Scores

A z score (or standardized value) is found by converting a value to a standardized scale, as given in the following definition. We will use z scores extensively in Chapter 6 and later chapters, so they are extremely important.

Definition: A z score (or standardized value) is the number of standard deviations that a given value x is above or below the mean. It is found using the following expressions:

Sample:

\[ z = \frac{x - \bar{x}}{s} \]

Population:

\[ z = \frac{x - \mu}{\sigma} \]

(Round z to two decimal places.)

The following example illustrates how z scores can be used to compare values, even though they might come from different populations.

EXAMPLE Comparing Heights With a height of 75 in., Lyndon Johnson was the tallest president of the past century. With a height of 85 in., Shaquille O'Neal is the tallest player on the Miami Heat basketball team. Who is relatively taller: Lyndon Johnson among the presidents of the past century, or Shaquille O'Neal among the players on his Miami Heat team? Presidents of the past century have heights with a mean of 71.5 in. and a standard deviation of 2.1 in. Basketball players for the Miami Heat have heights with a mean of 80.0 in. and a standard deviation of 3.3 in.

SOLUTION The heights of presidents and basketball players are from very different populations, so a comparison requires that we standardize heights by converting them to z scores.

Lyndon Johnson:

\[ z = \frac{x - \mu}{\sigma} = \frac{75 - 71.5}{2.1} = 1.67 \]

Shaquille O'Neal:

\[ z = \frac{x - \mu}{\sigma} = \frac{85 - 80.0}{3.3} = 1.52 \]

INTERPRETATION Lyndon Johnson's height is 1.67 standard deviations above the mean, and Shaquille O'Neal's height is 1.52 standard deviations above the mean. Lyndon Johnson's height among presidents of the past century is relatively greater than Shaquille O'Neal's height among the Miami Heat basketball players. Shaquille O'Neal is much taller than Lyndon Johnson, but Johnson is relatively taller when compared to colleagues.

z Scores and Unusual Values

In Section 3-3 we used the range rule of thumb to conclude that a value is unusual if it is more than 2 standard deviations away from the mean. It follows that unusual values have z scores less than -2 or greater than +2. (See Figure 3-5.) Using this criterion, Lyndon Johnson is not unusually tall when compared to presidents of the past century, and Shaquille O'Neal is not unusually tall when compared to his teammates, because neither of them has a height with a z score greater than 2.

Ordinary values: \(-2 \le z \text{ score} \le 2\)
Unusual values: \(z \text{ score} < -2\) or \(z \text{ score} > 2\)

[Figure 3-5 Interpreting z Scores. Unusual values are those with z scores less than -2.00 or greater than 2.00; ordinary values lie between them]

While considering Miami Heat basketball players, the shortest player is Damon Jones with a height of 75 in. His z score is -1.52, as shown in the calculation below. (We again use \(\mu = 80.0\) in. and \(\sigma = 3.3\) in. for the Miami Heat.)

Damon Jones:

\[ z = \frac{x - \mu}{\sigma} = \frac{75 - 80.0}{3.3} = -1.52 \]

Damon Jones' height illustrates this principle about values that are below the mean: Whenever a value is less than the mean, its corresponding z score is negative.

z scores are measures of position in the sense that they describe the location of a value (in terms of standard deviations) relative to the mean. A z score of 2 indicates that a value is two standard deviations above the mean, and a z score of -3 indicates that a value is three standard deviations below the mean. Quartiles and percentiles are also measures of position, but they are defined differently than z scores and they are useful for comparing values within the same data set or between different sets of data.
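The height comparison and the "unusual value" criterion can be sketched as follows. The function names `z_score` and `is_unusual` are chosen for illustration; the cutoff of 2 standard deviations is the one stated in the text.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

def is_unusual(z):
    """Unusual values have z scores less than -2 or greater than +2."""
    return z < -2 or z > 2

heights = {
    "Lyndon Johnson": (75, 71.5, 2.1),     # height, mean, std dev for presidents
    "Shaquille O'Neal": (85, 80.0, 3.3),   # Miami Heat players
    "Damon Jones": (75, 80.0, 3.3),        # Miami Heat players
}

scores = {name: z_score(*args) for name, args in heights.items()}
for name, z in scores.items():
    print(f"{name}: z = {z:.2f}, unusual: {is_unusual(z)}")
```

Johnson and Jones have the same height, yet their z scores have opposite signs because each is measured against a different group's mean and standard deviation.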

Section 3. Measures of Variation

Section 3. Measures of Variation Section 3 Measures of Variation Range Range = (maximum value) (minimum value) It is very sensitive to extreme values; therefore not as useful as other measures of variation. Sample Standard Deviation The

More information

Lecture 11. Data Description Estimation

Lecture 11. Data Description Estimation Lecture 11 Data Description Estimation Measures of Central Tendency (continued, see last lecture) Sample mean, population mean Sample mean for frequency distributions The median The mode The midrange 3-22

More information

Chapter 3 Data Description

Chapter 3 Data Description Chapter 3 Data Description Section 3.1: Measures of Central Tendency Section 3.2: Measures of Variation Section 3.3: Measures of Position Section 3.1: Measures of Central Tendency Definition of Average

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population . Measures of Central Tendency: Mode, Median and Mean Average a single number that is used to describe the entire sample or population. Mode a. Easiest to compute, but not too stable i. Changing just one

More information

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest: 1 Chapter 3 - Descriptive stats: Numerical measures 3.1 Measures of Location Mean Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size Example: The number

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts. Slide 1 Slide 2 Daphne Phillip Kathy Slide 3 Pick a Brick 100 pts 200 pts 500 pts 300 pts 400 pts 200 pts 300 pts 500 pts 100 pts 300 pts 400 pts 100 pts 400 pts 100 pts 200 pts 500 pts 100 pts 400 pts

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16 Stats Review Chapter Revised 8/16 Note: This review is composed of questions similar to those found in the chapter review and/or chapter test. This review is meant to highlight basic concepts from the

More information

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc. Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,

More information

Unit 2: Numerical Descriptive Measures

Unit 2: Numerical Descriptive Measures Unit 2: Numerical Descriptive Measures Summation Notation Measures of Central Tendency Measures of Dispersion Chebyshev's Rule Empirical Rule Measures of Relative Standing Box Plots z scores Jan 28 10:48

More information

MATH 117 Statistical Methods for Management I Chapter Three

MATH 117 Statistical Methods for Management I Chapter Three Jubail University College MATH 117 Statistical Methods for Management I Chapter Three This chapter covers the following topics: I. Measures of Center Tendency. 1. Mean for Ungrouped Data (Raw Data) 2.

More information

equal to the of the. Sample variance: Population variance: **The sample variance is an unbiased estimator of the

equal to the of the. Sample variance: Population variance: **The sample variance is an unbiased estimator of the DEFINITION The variance (aka dispersion aka spread) of a set of values is a measure of equal to the of the. Sample variance: s Population variance: **The sample variance is an unbiased estimator of the

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

Exam: practice test 1 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Exam: practice test 1 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Exam: practice test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. ) Using the information in the table on home sale prices in

More information

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami Unit Two Descriptive Biostatistics Dr Mahmoud Alhussami Descriptive Biostatistics The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

MgtOp 215 Chapter 3 Dr. Ahn

MgtOp 215 Chapter 3 Dr. Ahn MgtOp 215 Chapter 3 Dr. Ahn Measures of central tendency (center, location): measures the middle point of a distribution or data; these include mean and median. Measures of dispersion (variability, spread):

More information

Measures of the Location of the Data

Measures of the Location of the Data Measures of the Location of the Data 1. 5. Mark has 51 films in his collection. Each movie comes with a rating on a scale from 0.0 to 10.0. The following table displays the ratings of the aforementioned

More information

Instructor: Doug Ensley Course: MAT Applied Statistics - Ensley

Instructor: Doug Ensley Course: MAT Applied Statistics - Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 04 - Sections 2.5 and 2.6 1. A travel magazine recently presented data on the annual number of vacation

More information

CHAPTER 4 VARIABILITY ANALYSES. Chapter 3 introduced the mode, median, and mean as tools for summarizing the

CHAPTER 4 VARIABILITY ANALYSES. Chapter 3 introduced the mode, median, and mean as tools for summarizing the CHAPTER 4 VARIABILITY ANALYSES Chapter 3 introduced the mode, median, and mean as tools for summarizing the information provided in an distribution of data. Measures of central tendency are often useful

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 3.1-1

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 3.1-1 Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by Mario F. Triola 3.1-1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview

More information

Section 3.2 Measures of Central Tendency

Section 3.2 Measures of Central Tendency Section 3.2 Measures of Central Tendency 1 of 149 Section 3.2 Objectives Determine the mean, median, and mode of a population and of a sample Determine the weighted mean of a data set and the mean of a

More information

3.3. Section. Measures of Central Tendency and Dispersion from Grouped Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

3.3. Section. Measures of Central Tendency and Dispersion from Grouped Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc. Section 3.3 Measures of Central Tendency and Dispersion from Grouped Data Objectives 1. Approximate the mean of a variable from grouped data 2. Compute the weighted mean 3. Approximate the standard deviation

More information

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study CHAPTER PROBLEM Do women really talk more than men? A common belief is that women talk more than men. Is that belief founded in fact, or is it a myth? Do men actually talk more than women? Or do men and

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Mean 26.86667 Standard Error 2.816392 Median 25 Mode 20 Standard Deviation 10.90784 Sample Variance 118.981 Kurtosis -0.61717 Skewness

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data:

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Lecture 2 Quantitative variables There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Stemplot (stem-and-leaf plot) Histogram Dot plot Stemplots

More information

Describing Distributions with Numbers

Describing Distributions with Numbers Describing Distributions with Numbers Using graphs, we could determine the center, spread, and shape of the distribution of a quantitative variable. We can also use numbers (called summary statistics)

More information

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Chapter 2: Summarising numerical data Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Extract from Study Design Key knowledge Types of data: categorical (nominal and ordinal)

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation Chapter Four Numerical Descriptive Techniques 4.1 Numerical Descriptive Techniques Measures of Central Location Mean, Median, Mode Measures of Variability Range, Standard Deviation, Variance, Coefficient

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data Finding Quartiles. Use the median to divide the ordered data set into two halves.. If n is odd, do not include the median in either half. If n is even, split this data set exactly in half.. Q1 is the median

More information

1.3: Describing Quantitative Data with Numbers

1.3: Describing Quantitative Data with Numbers 1.3: Describing Quantitative Data with Numbers Section 1.3 Describing Quantitative Data with Numbers After this section, you should be able to MEASURE center with the mean and median MEASURE spread with

More information

How spread out is the data? Are all the numbers fairly close to General Education Statistics

How spread out is the data? Are all the numbers fairly close to General Education Statistics How spread out is the data? Are all the numbers fairly close to General Education Statistics each other or not? So what? Class Notes Measures of Dispersion: Range, Standard Deviation, and Variance (Section

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

After completing this chapter, you should be able to:

After completing this chapter, you should be able to: Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Lecture 2. Descriptive Statistics: Measures of Center

Lecture 2. Descriptive Statistics: Measures of Center Lecture 2. Descriptive Statistics: Measures of Center Descriptive Statistics summarize or describe the important characteristics of a known set of data Inferential Statistics use sample data to make inferences

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Chapter 3. Data Description. McGraw-Hill, Bluman, 7 th ed, Chapter 3

Chapter 3. Data Description. McGraw-Hill, Bluman, 7 th ed, Chapter 3 Chapter 3 Data Description McGraw-Hill, Bluman, 7 th ed, Chapter 3 1 Chapter 3 Overview Introduction 3-1 Measures of Central Tendency 3-2 Measures of Variation 3-3 Measures of Position 3-4 Exploratory

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives F78SC2 Notes 2 RJRC Algebra It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and just substitute in the numbers at the end. Example: 100 invested

More information

The empirical ( ) rule

The empirical ( ) rule The empirical (68-95-99.7) rule With a bell shaped distribution, about 68% of the data fall within a distance of 1 standard deviation from the mean. 95% fall within 2 standard deviations of the mean. 99.7%

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

Chapter 5. Understanding and Comparing. Distributions

Chapter 5. Understanding and Comparing. Distributions STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. 1 Histograms p53 The breakfast cereal data Study collected data on nutritional

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

Week 1: Intro to R and EDA

Week 1: Intro to R and EDA Statistical Methods APPM 4570/5570, STAT 4000/5000 Populations and Samples 1 Week 1: Intro to R and EDA Introduction to EDA Objective: study of a characteristic (measurable quantity, random variable) for

More information

GRAPHS AND STATISTICS Central Tendency and Dispersion Common Core Standards

GRAPHS AND STATISTICS Central Tendency and Dispersion Common Core Standards B Graphs and Statistics, Lesson 2, Central Tendency and Dispersion (r. 2018) GRAPHS AND STATISTICS Central Tendency and Dispersion Common Core Standards Next Generation Standards S-ID.A.2 Use statistics

More information

FSA Algebra I End-of-Course Review Packet

FSA Algebra I End-of-Course Review Packet FSA Algebra I End-of-Course Review Packet Table of Contents MAFS.912.N-RN.1.2 EOC Practice... 3 MAFS.912.N-RN.2.3 EOC Practice... 5 MAFS.912.N-RN.1.1 EOC Practice... 8 MAFS.912.S-ID.1.1 EOC Practice...

More information

Topic-1 Describing Data with Numerical Measures

Topic-1 Describing Data with Numerical Measures Topic-1 Describing Data with Numerical Measures Central Tendency (Center) and Dispersion (Variability) Central tendency: measures of the degree to which scores are clustered around the mean of a distribution

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

Numerical Measures of Central Tendency

Numerical Measures of Central Tendency ҧ Numerical Measures of Central Tendency The central tendency of the set of measurements that is, the tendency of the data to cluster, or center, about certain numerical values; usually the Mean, Median

More information

Let's Do It! What Type of Variable?

Let's Do It! What Type of Variable? Ch Online homework list: Describing Data Sets Graphical Representation of Data Summary statistics: Measures of Center Box Plots, Outliers, and Standard Deviation Ch Online quizzes list: Quiz 1: Introduction

More information

Sections 6.1 and 6.2: The Normal Distribution and its Applications

Sections 6.1 and 6.2: The Normal Distribution and its Applications Sections 6.1 and 6.2: The Normal Distribution and its Applications Definition: A normal distribution is a continuous, symmetric, bell-shaped distribution of a variable. The equation for the normal distribution

More information

CHAPTER 1 Exploring Data

CHAPTER 1 Exploring Data CHAPTER 1 Exploring Data 1.3 Describing Quantitative Data with Numbers The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers 1.3 Reading Quiz True or false?

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

More information

OBJECTIVES INTRODUCTION

OBJECTIVES INTRODUCTION M7 Chapter 3 Section 1 OBJECTIVES Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance, and

More information

DESCRIPTIVE STATISTICS

DESCRIPTIVE STATISTICS 1 3 DESCRIPTIVE STATISTICS. 3.1 Measures of Central Tendency Because the data values of most numerical variables show a tendency to group around a specific value, statisticians use a set of methods, collectively

More information
