STT 315 This lecture is based on Chapter 2 of the textbook.


1 STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: The author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use and edit some of their slides.

2 Topic of this chapter These materials can be read from Chapter 2 of the textbook. We shall first cover some descriptive statistics of qualitative variables (Ch2.1). Later we shall study descriptive statistics of quantitative variables. In descriptive statistics we summarize data through graphs and tables.

3 How to display Qualitative Data? Frequency tables; bar graph (or bar chart); pie chart (or pie diagram); Pareto chart (or Pareto diagram).

4 Qualitative variables A qualitative (or categorical) variable usually cannot be measured on a numerical scale; it simply records a quality. Each category of a qualitative variable is also called a class or level. For instance, the qualitative variable GENDER has two classes, namely Male and Female. The number of observations belonging to a class is called the class frequency, or simply the frequency. The relative frequency of a class is obtained by dividing the class frequency by the total number of observations.

5 Frequency Tables These are tables in which the classes (categories) are written in the leftmost column and the corresponding counts are written in the second column. A count is also known as a frequency. Sometimes proportions (or percentages) are written instead of, or in addition to, the actual counts. A proportion is also called a relative frequency.
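The definitions of frequency and relative frequency can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the sample observations are hypothetical and do not come from the slides.

```python
from collections import Counter


def frequency_table(observations):
    """Return {class: (frequency, relative frequency)} for categorical data."""
    counts = Counter(observations)
    total = len(observations)
    return {cls: (count, count / total) for cls, count in counts.items()}


# Hypothetical observations of a qualitative variable (not from the slides).
sample = ["Male", "Female", "Female", "Male", "Female"]
table = frequency_table(sample)

# Print class, count and percentage, one class per row.
for cls, (freq, rel_freq) in table.items():
    print(f"{cls:<8}{freq:>3}{rel_freq:>8.0%}")
```

The relative frequencies always add up to 1, which is a quick sanity check on any frequency table.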

6 Frequency Table: An Example Frequency table of the number of golf balls sold on different days of a week. Columns: Day | # of Golf Balls Sold (Frequency) | % of Golf Balls Sold. Rows: Monday, Tuesday, Wednesday, Thursday, Friday, Total.

7 Bar Charts A bar chart or bar graph is a chart with rectangular bars with lengths proportional to the values that they represent. The bars can be plotted vertically (more common) or horizontally (less common). The percentage or relative proportions can also be plotted instead of the actual values. 7

8 Bar Chart: Golf Balls Sold [Figure: bar chart of # of golf balls sold on Monday, Tuesday, Wednesday, Thursday and Friday]

9 Bar Chart: % Golf Balls Sold [Figure: bar chart of % of golf balls sold on Monday, Tuesday, Wednesday, Thursday and Friday]

10 Pie Chart A pie chart (or circle graph) is a circular chart divided into sectors, illustrating proportions. The arc length of each sector (and consequently its central angle and area) is proportional to the quantity it represents. The calculation rests on the fact that 100% corresponds to 360 degrees, so a class with relative frequency p% gets a central angle of (p/100) x 360 degrees.
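As a rough sketch of the "100% is 360 degrees" rule, the snippet below converts percentage shares into central angles. The function name `sector_angles` is an assumption made for illustration; the percentages are the ones printed on the golf-ball pie chart slide.

```python
def sector_angles(percentages):
    """Convert percentage shares to central angles in degrees (100% = 360)."""
    return [p * 360 / 100 for p in percentages]


# Percentages printed on the golf-ball pie chart. The transcript does not
# preserve which day each slice belongs to, so no day labels are attached.
shares = [25, 20, 15, 23, 17]
angles = sector_angles(shares)

print([round(a, 1) for a in angles])
print(round(sum(angles), 1))  # the angles of a complete pie add up to 360
```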

11 Pie Chart: Golf Balls Sold [Figure: pie chart of % of golf balls sold on Monday, Tuesday, Wednesday, Thursday and Friday; the slices are labeled 25%, 20%, 15%, 23% and 17%]

12 Pie Chart: An Example Pie Chart of English Native Speakers 12

13 Bar Chart vs. Pie Chart A bar chart is used more often to represent actual values, while a pie chart is used to represent relative proportions (in %). When comparing relative proportions is important, a pie chart is more appropriate. When the absolute counts or values are more important, a bar chart should be used.

14 Major points so far First step in organizing data: draw a picture. Appropriate pictures for categorical data: pie chart, bar chart.

15 Pareto diagram A Pareto diagram is a particular type of bar diagram in which the classes are arranged on the horizontal axis in order of decreasing frequency. That is, in a Pareto diagram the leftmost class has the highest frequency bar, followed by the class with the next highest frequency bar, and so on.

16 The following Pareto diagram represents the incarceration rate (per people) of various countries. 16

17 Displaying Quantitative Data Histograms Stem-and-Leaf Displays Dotplots 17

18 Histograms A histogram is a graphical representation that gives a visual impression of the distribution of quantitative data. It consists of adjacent rectangles erected over intervals (also known as bins or classes). The lengths of the intervals may differ, and an interval may contain a single value. The height of each rectangle equals the number (frequency) of observations in the corresponding bin. Sometimes percentages (or relative frequencies) are represented by the heights instead.

19 Histogram: An Example The heights of 31 Black Cherry trees 19

20 A Few Questions How to choose the bin size? Let the computer decide it for you. What happens to observations on the boundary of two bins? Put them in the higher bin. Don't we lose information? Yes, we do.

21 Stem-and-Leaf Display Another device for presenting quantitative data in a graphical format. Assists in visualizing the shape of the distribution of the observations. Unlike histograms, stem-and-leaf displays retain the original data. Contains two columns separated by a vertical line. The left column contains the stems and the right column contains the leaves. Suppose we have the following data on weights (in lb) of 17 school-kids:

22 How do they work? [Figure: sorted data and the corresponding stem-and-leaf display, with a Stem column and a Leaf column] Key: 6 | 3 = 63. Leaf unit: 1.0.

23 Dotplots A dotplot is a statistical chart consisting of a group of data points plotted on a simple scale. Dotplots can be drawn both horizontally and vertically.

24 Summary We have learnt three methods of displaying quantitative data: histogram, stem-and-leaf display and dotplot. When the data-size is small, stem-and-leaf display and dotplot are more useful. When the data-size is large, histogram is more useful. 24

25 Distribution of the Data-points Three important features: Shape of the distribution, Center of the distribution, Spread of the distribution. 25

26 Shape of a Distribution: Modes The peaks of a histogram are called modes. A distribution is unimodal if it has one mode, bimodal if it has two modes, multimodal if it has three or more modes. 26

27 Unimodal, Bimodal or Multimodal? [Figure: three histograms, labeled Unimodal, Bimodal and Multimodal]

28 Uniform Histogram A histogram that doesn't appear to have any mode. All the bars are approximately the same height.

29 Shape of a Distribution: Symmetry If the histogram can be folded along a vertical line through the middle and have the edges match pretty closely, then the distribution is symmetric. Otherwise, it is skewed. 29

30 Skewed to the left or right? Skewed to the left (the tail is on the left). Skewed to the right (the tail is on the right).

31 Shape of a Distribution: Outliers Outliers are the data-points that stand off away from the body of the histogram. They are too high or too low compared to most of the observations. 31

32 The following distribution is A. Unimodal and skewed to the left B. Bimodal and skewed to the right C. Bimodal and symmetric D. Multimodal and symmetric E. Unimodal and skewed to the right 32

33 Does this distribution have an outlier? (a) Yes, it does (b) No, it doesn't

34 The following distribution is A. Unimodal and skewed to the left B. Bimodal and skewed to the right C. Bimodal and symmetric D. Multimodal and symmetric E. Unimodal and skewed to the right 34

35 Numerical measures for quantitative data 35

36 Center of a Distribution Median: The middlemost observation when the data are sorted in increasing order. The median can always be used as the center of a distribution. Mean: The average of all data-points. The mean can be used as the center of a distribution when the distribution is symmetric.

37 What is Median? Median is the middlemost observation when the data is sorted in increasing order. Data: 23, 33, 12, 39, 27 Sorted Data: 12, 23, 27, 33, 39 Median: 27 37

38 What if there is an even number of observations? Take the average of the two middlemost observations in that case. Data: 23, 33, 12, 39, 27, 10 Sorted Data: 10, 12, 23, 27, 33, 39 Median = (23+27)/2 = 25.

39 What is the general rule? Suppose there are n observations. Sort them in increasing order. If n is odd, then the median is the observation in the ((n+1)/2)th position. If n is even, then the median is the average of the observations in the (n/2)th and (n/2 + 1)th positions.

40 When n is odd Data: 23, 33, 12, 39, 27 n = 5 (odd) Sorted Data: 12, 23, 27, 33, 39 Median = observation in the ((5+1)/2)th position = observation in the 3rd position = 27.

41 When n is even Data: 23, 33, 12, 39, 27, 10 n = 6 (even) Sorted Data: 10, 12, 23, 27, 33, 39 Median = average of the observations in the (6/2)th and (6/2 + 1)th positions = average of the observations in the 3rd and 4th positions = (23+27)/2 = 25.
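The general rule for the median translates directly into code. The sketch below is one possible implementation; the function name `median_by_rule` is hypothetical, and the only subtlety is that Python lists are indexed from 0 while the slides count positions from 1.

```python
def median_by_rule(data):
    """Median via the positional rule: sort, then pick the middle value(s)."""
    values = sorted(data)
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Position (n+1)/2 when counting from 1 is index n//2 when counting from 0.
        return values[mid]
    # Positions n/2 and n/2 + 1 (from 1) are indices mid-1 and mid (from 0).
    return (values[mid - 1] + values[mid]) / 2


odd_median = median_by_rule([23, 33, 12, 39, 27])
even_median = median_by_rule([23, 33, 12, 39, 27, 10])
print(odd_median, even_median)
```

Applied to the two data sets on the slides, this reproduces the medians 27 and 25.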

42 What is mean? The mean is the average of all the observations (i.e., add up all the values and divide by the number of values). If an observation repeats, we add it as many times as it repeats when we calculate the average. The mean can be used as the center of a distribution when the distribution is symmetric. Data: 10, 13, 18, 22, 29 Mean = (10 + 13 + 18 + 22 + 29)/5 = 92/5 = 18.4.

43 Mean vs. Median Data: 10, 13, 18, 22, 29 Without the outlier: Mean = 18.4, Median = 18. Data: 10, 13, 18, 22, 29, 68 With the outlier: Mean = 160/6 = 26.67, Median = 20. Conclusion: The mean is more outlier-sensitive than the median.
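The comparison above can be checked with Python's standard `statistics` module. This is a small sketch using the data from the slide; the variable names are chosen here for illustration.

```python
from statistics import mean, median

data = [10, 13, 18, 22, 29]
with_outlier = data + [68]  # the same data plus the outlier from the slide

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The single outlier moves the mean much further than it moves the median.
print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before:.2f} -> {median_after:.2f}")
```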

44 Mean vs. Median Mean is more outlier-sensitive compared to median. For a symmetric distribution, mean = median. Thus mean is more useful as the center of a distribution when the distribution is symmetric. But median can always be used as the center of a distribution. For a right-skewed distribution, mean > median. For a left-skewed distribution, mean < median. Learn to use TI 83/84 Plus to compute mean and median. 44

45 TI 83/84 Plus commands To enter the data: Press [STAT] Under EDIT select 1: Edit and press ENTER Columns with names L1, L2 etc. will appear Type the data value under the column; each data entry will be followed by ENTER. To clear data: Pressing CLEAR will clear the particular data. To clear all data from all columns press [2nd] & + and then choose 4: ClrAllLists. 45

46 TI 83/84 Plus commands 46

47 Effect of Linear Transformation Suppose every observation is multiplied by a fixed constant. Then the median of the transformed observations is the median of the original observations times that same constant, and the mean of the transformed observations is the mean of the original observations times that same constant. Data: 10, 13, 18, 22, 29 Mean = 18.4, Median = 18. Suppose transformed data = (-3)*original data. So transformed data: -30, -39, -54, -66, -87 Mean = (-3)*18.40 = -55.2, Median = (-3)*18 = -54.

48 Effect of Linear Transformation Suppose a fixed constant is added to (or subtracted from) each observation. Then the median of the transformed observations is the median of the original observations plus (or minus) that same constant, and the mean of the transformed observations is the mean of the original observations plus (or minus) that same constant. Data: 10, 13, 18, 22, 29 Mean = 18.4, Median = 18. Suppose transformed data = original data + 2.5. Hence transformed data: 12.5, 15.5, 20.5, 24.5, 31.5 Mean = 18.4 + 2.5 = 20.9, Median = 18 + 2.5 = 20.5.

49 Spread of a Distribution Are the values concentrated around the center of the distribution, or are they spread out? Measures of spread: Range, Interquartile Range, Variance, Standard Deviation. Note: Variance and standard deviation are more appropriate when the distribution is symmetric.

50 Range The range of the data is defined as the difference between the maximum and the minimum values. Data: 23, 21, 67, 44, 51, 12, 35. Range = maximum - minimum = 67 - 12 = 55. Disadvantage: A single extreme value can make it very large, giving a value that does not really represent the data overall. On the other hand, it is not affected at all if an observation in the middle changes.

51 Interquartile Range (IQR) What is IQR? IQR = Third Quartile (Q3) - First Quartile (Q1). What are quartiles? Recall: The median divides the data into 2 equal halves. The first quartile, the median and the third quartile divide the data into 4 roughly equal parts.

52 Quartiles The first quartile (Q1, lower quartile) is the value that is larger than 25% of the observations, but smaller than 75% of the observations. The second quartile (Q2) is the median, which is larger than 50% of the observations, but smaller than 50% of the observations. The third quartile (Q3, upper quartile) is the value that is larger than 75% of the observations, but smaller than 25% of the observations. Always, Q1 <= Q2 (= median) <= Q3. How to compute the quartiles? We shall use TI 83/84 Plus.

53 IQR vs. Range IQR is a better summary of the spread of a distribution than the range because it carries some information about the entire data, whereas the range only carries information about the extreme values of the data. IQR is less outlier-sensitive than the range.

54 Outlier-sensitivity Data: 10, 13, 17, 21, 28, 32 Without the outlier IQR = 15 Range = 22 Data: 10, 13, 17, 21, 28, 32, 59 With the outlier IQR = 19 Range = 49 Conclusion: IQR is less outlier-sensitive than range. 54
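The IQR and range values above can be reproduced with a short sketch. It assumes the "median of each half" convention for quartiles (the median itself is left out of both halves when n is odd), which matches the numbers on the slide; other software may use different quartile conventions and give slightly different answers. The function names are hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Requires at least two observations.
    """
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]    # lower half, excluding the median when n is odd
    upper = values[-half:]   # upper half, excluding the median when n is odd
    return median(lower), median(values), median(upper)


def iqr(data):
    q1, _, q3 = quartiles(data)
    return q3 - q1


def value_range(data):
    return max(data) - min(data)


data = [10, 13, 17, 21, 28, 32]
with_outlier = data + [59]

print(iqr(data), value_range(data))                   # without the outlier
print(iqr(with_outlier), value_range(with_outlier))   # with the outlier
```

Adding the outlier raises the IQR from 15 to 19 but the range from 22 to 49, which is the sense in which the IQR is less outlier-sensitive.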

55 Variance and Standard Deviation The sample variance ($s^2$) is defined as: $s^2 = \frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$. In words: subtract the mean from each value, square each difference, add up the squares, and divide by one fewer than the sample size. The sample standard deviation ($s$) is the positive square root of the sample variance, i.e. $s = \sqrt{s^2}$.

56 Variance and Standard Deviation The larger the variance (and standard deviation), the more dispersed the observations are around the mean. The unit of variance is the square of the unit of the original data, whereas the standard deviation has the same unit as the original data. Both variance and standard deviation are more appropriate for symmetric distributions.

57 Standard Deviation: An Example Data: 3, 12, 8, 9, 3 (n = 5 in this case). Mean = (3 + 12 + 8 + 9 + 3)/5 = 35/5 = 7.
Data | Deviation from mean | Squared deviation
3 | 3 - 7 = -4 | (-4) x (-4) = 16
12 | 12 - 7 = 5 | 5 x 5 = 25
8 | 8 - 7 = 1 | 1 x 1 = 1
9 | 9 - 7 = 2 | 2 x 2 = 4
3 | 3 - 7 = -4 | (-4) x (-4) = 16
Total = 62. Now divide by n - 1 = 4: s² = 62/4 = 15.5, so s = √15.5 ≈ 3.94. Answer: The variance is 15.5 and the standard deviation in this example is approximately 3.94.
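The same step-by-step calculation can be written as a short sketch. The function name `sample_variance` is chosen here for illustration; Python's `statistics.variance` and `statistics.stdev` compute the same quantities.

```python
import math


def sample_variance(data):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / (n - 1)


data = [3, 12, 8, 9, 3]
s_squared = sample_variance(data)
s = math.sqrt(s_squared)
print(s_squared, round(s, 3))
```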

58 Effect of Linear Transformation Suppose every observation is multiplied by a fixed constant. Then the range/IQR/standard deviation of the transformed observations is the range/IQR/standard deviation of the original observations times the absolute value of that same constant, and the variance of the transformed observations is the variance of the original observations times the square of that same constant. Temperature data (in °F): 10, 13, 18, 22, 29. Range = 19 °F, IQR = 14 °F, s = 7.5 °F, s² = 56.25 °F². Suppose transformed data = (-3)*original data. So transformed data (in °F): -30, -39, -54, -66, -87. Range = |-3|*19 = 57 °F, IQR = |-3|*14 = 42 °F, s = |-3|*7.5 = 22.5 °F, s² = (-3)²*56.25 = 506.25 °F².

59 Effect of Linear Transformation Suppose a fixed constant is added to (or subtracted from) each observation. Then the range/IQR/standard deviation/variance of the transformed observations remains the same as that of the original observations. Temperature data (in °F): 10, 13, 18, 22, 29. Range = 19 °F, IQR = 14 °F, s = 7.5 °F, s² = 56.25 °F². Suppose transformed data = original data + 2.5. Hence transformed data (in °F): 12.5, 15.5, 20.5, 24.5, 31.5. Range = 19 °F, IQR = 14 °F, s = 7.5 °F, s² = 56.25 °F².
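Both transformation rules for spread can be verified numerically. The sketch below uses the temperature data from the slides with Python's `statistics` module; the helper `value_range` is a hypothetical name. Note that the slides round s to 7.5 before squaring, so the unrounded variance printed here (56.3) differs slightly from the 56.25 quoted above.

```python
from statistics import stdev, variance

data = [10, 13, 18, 22, 29]
scaled = [-3 * x for x in data]     # every observation multiplied by -3
shifted = [x + 2.5 for x in data]   # 2.5 added to every observation


def value_range(values):
    return max(values) - min(values)


for label, values in (("original", data), ("scaled", scaled), ("shifted", shifted)):
    print(f"{label:<9} range={value_range(values):>5} "
          f"s={stdev(values):6.2f} s^2={variance(values):7.2f}")
```

Scaling by -3 multiplies the range and standard deviation by 3 and the variance by 9, while shifting leaves all three unchanged.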

60 Empirical rule & Chebyshev's rule

61 Empirical rule For an approximately symmetric unimodal (bell-shaped or mound-shaped) distribution: Approximately 68% of observations fall within 1 standard deviation of the mean. Approximately 95% of observations fall within 2 standard deviations of the mean. Approximately 99.7% of observations fall within 3 standard deviations of the mean.

62 Empirical rule 62

63 Empirical rule 63

64 Chebyshev's rule For any distribution, at least $1 - \frac{1}{k^2}$ of the observations will fall within k standard deviations of the mean, where $k \geq 1$. Chebyshev's rule holds for any distribution, whereas the empirical rule is valid only for approximately symmetric unimodal (mound-shaped) distributions. If k = 1, Chebyshev's rule gives a lower bound of 0, so it provides no useful information. According to Chebyshev, at least 75% of observations fall within 2 standard deviations of the mean. According to Chebyshev, at least 88.9% of observations fall within 3 standard deviations of the mean.
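The percentages quoted for Chebyshev's rule follow directly from the formula. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("Chebyshev's rule is stated here for k >= 1")
    return 1 - 1 / k**2


for k in (1, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

For k = 1 the bound is 0%, which is why the rule is uninformative in that case.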

65 Box plot 65

66 Box Plot Box plot is a graphical representation of the following 5 number summary: 1. Minimum Value, 2. Lower Quartile, 3. Median (the middle value), 4. Upper Quartile, 5. Maximum Value. NOTE: Data must be ordered from lowest value to highest value before finding the 5 number summary. 66

67 Box Plots Box plots are a representation of the five-number summary (Minimum, Lower Quartile, Median, Upper Quartile, Maximum). Half the data are in the box. One-quarter of the data are in each whisker. If one part of the plot is long, the data are skewed. A box plot is very useful for comparing distributions. [Figure: box plot] This box plot indicates the data are skewed to the left.

68 Box Plot Box Plot is a pictorial representation of the 5-number summary. 68

69 Outliers Any observation farther than 1.5 times IQR from the closest boundary of the box is an outlier. If it is farther than 3 times IQR, it is an extreme outlier, otherwise a mild outlier. One can also indicate the outliers in a box plot, by drawing the whiskers only up to 1.5 times IQR on both sides, and indicating outliers with stars or crosses (or other symbols). 69

70 An example Suppose min = 2, Q1 = 18, median = 20, Q3 = 22, max = 35. Which of the following observations are outliers? A. 10 B. 15 C. 25 D. 30
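The 1.5 x IQR and 3 x IQR rules can be applied to this example with a short sketch. The function name `classify` and its labels are chosen here for illustration.

```python
def classify(x, q1, q3):
    """Label an observation using the 1.5*IQR (mild) and 3*IQR (extreme) rules."""
    iqr = q3 - q1
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme outlier"
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild outlier"
    return "not an outlier"


q1, q3 = 18, 22  # quartiles from the example; IQR = 4, so the fences are 12 and 28
labels = {x: classify(x, q1, q3) for x in (10, 15, 25, 30)}
for x, label in labels.items():
    print(x, label)
```

With these quartiles, 10 and 30 lie beyond the fences at 12 and 28, while 15 and 25 do not.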

71 Histogram vs. Box plot Both histogram and box plot capture the symmetry or skewness of distributions. Box plot cannot indicate the modality of the data. Box plot is much better in finding outliers. The shape of histogram depends to some extent on the choice of bins. 71

72 Comparing Distributions We can compare between distributions of various data-sets using Box Plots (or the 5-Number Summary), Histograms. We shall first compare distributions using box plots.

73 Which type of car has the largest median Time to accelerate? A. upscale B. sports C. small D. large E. family 73

74 Which type of car has the smallest median time value? A. upscale B. sports C. small D. Large E. Luxury 74

75 Which type of car always takes less than 3.6 seconds to accelerate? A. upscale B. sports C. small D. Large E. Luxury

76 Which type of car has the smallest IQR for Time to accelerate? A. upscale B. sports C. small D. Large E. Luxury 76

77 What is the shape of the distribution of acceleration times for luxury cars? A. Left skewed B. Right skewed C. Roughly symmetric D. Cannot be determined from the information given. 77

78 What percent of luxury cars accelerate to 30 mph in less than 3.5 seconds? A. Roughly 25% B. Exactly 37.5% C. Roughly 50% D. Roughly 75% E. Cannot be determined from the information given 78

79 What percent of family cars accelerate to 30 mph in less than 3.5 seconds? A. Less than 25% B. More than 50% C. Less than 50% D. Exactly 75% E. None of the above

80 Comparing Distributions Use of Histograms

81 Which data have more variability? [Figure: histograms A and B, FREQUENCY against SCORE] A. Graph A B. Graph B C. Both have the same variability

82 Which data have more variability? [Figure: histograms A and B] A. Graph A B. Graph B C. Both have the same variability

83 Which data have a higher median? [Figure: histograms A and B] A. Graph A B. Graph B C. Both have the same median

84 Which data have more variability? [Figure: histograms A and B, FREQUENCY against SCORE] A. Graph A B. Graph B C. Roughly, both have the same variability

85 z-score 85

86 How to compare apples with oranges? A college admissions committee is looking at the files of two candidates, one with a total SAT score of 1500 and another with an ACT score of 22. Which candidate scored better? How do we compare things when they are measured on different scales? We need to standardize the values. 86

87 How to standardize? Subtract the mean from the value and then divide this difference by the standard deviation. The standardized value is the z-score: $z = \frac{\text{value} - \text{mean}}{\text{std. dev.}}$. z-scores are free of units.

88 z-scores: An Example Data: 4, 3, 10, 12, 8, 9, 3 (n = 7 in this case). Mean = (4 + 3 + 10 + 12 + 8 + 9 + 3)/7 = 49/7 = 7. Standard Deviation = 3.65.
Original value | z-score
4 | (4 - 7)/3.65 = -0.82
3 | (3 - 7)/3.65 = -1.10
10 | (10 - 7)/3.65 = 0.82
12 | (12 - 7)/3.65 = 1.37
8 | (8 - 7)/3.65 = 0.27
9 | (9 - 7)/3.65 = 0.55
3 | (3 - 7)/3.65 = -1.10
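Standardizing a whole data set is a two-step computation: find the mean and sample standard deviation, then transform each value. The sketch below uses the data from this example; the function name `z_scores` is hypothetical.

```python
from statistics import mean, stdev


def z_scores(data):
    """Standardize each value: subtract the mean, divide by the sample std. dev."""
    m = mean(data)
    s = stdev(data)
    return [(x - m) / s for x in data]


data = [4, 3, 10, 12, 8, 9, 3]
zs = z_scores(data)
print([round(z, 2) for z in zs])

# After standardization the mean is 0 and the standard deviation is 1.
print(round(mean(zs), 10), round(stdev(zs), 10))
```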

89 Interpretation of z-scores The z-scores measure the distance of the data values from the mean in the standard deviation scale. A z-score of 1 means that data value is 1 standard deviation above the mean. A z-score of -1.2 means that data value is 1.2 standard deviations below the mean. Regardless of the direction, the further a data value is from the mean, the more unusual it is. A z-score of -1.3 is more unusual than a z-score of

90 How to use z-scores? A college admissions committee is looking at the files of two candidates, one with a total SAT score of 1500 and another with an ACT score of 22. Which candidate scored better? SAT score mean = 1600, std dev = 500. ACT score mean = 23, std dev = 6. SAT score 1500 has z-score = (1500 - 1600)/500 = -0.20. ACT score 22 has z-score = (22 - 23)/6 = -0.17. ACT score 22 is better than SAT score 1500.

91 Which is more unusual? Heights of adult women have a mean of 63.6 in. and a std. dev. of 2.5 in. Heights of adult men have a mean of 69.0 in. and a std. dev. of 2.8 in. A. A 58 in tall woman: z-score = (58 - 63.6)/2.5 = -2.24. B. A 64 in tall man: z-score = (64 - 69)/2.8 = -1.79. C. They are the same.
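Both comparisons reduce to the same one-line formula. The sketch below recomputes them with a hypothetical helper `z_score`, using the means and standard deviations given on the slides.

```python
def z_score(value, mean, std_dev):
    """Distance of a value from the mean, measured in standard deviations."""
    return (value - mean) / std_dev


# Test scores: the higher z-score is the better score.
sat = z_score(1500, 1600, 500)
act = z_score(22, 23, 6)
print(f"SAT z = {sat:.2f}, ACT z = {act:.2f}")

# Heights: the z-score farther from 0 is the more unusual height.
woman = z_score(58, 63.6, 2.5)
man = z_score(64, 69.0, 2.8)
print(f"woman z = {woman:.2f}, man z = {man:.2f}")
```

The ACT score has the higher z-score, and the 58 in woman is farther from her group's mean (in standard deviation units) than the 64 in man.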

92 Using z-scores to solve problems An example using height data and U.S. Marine and Army height requirements Question: Are the height restrictions set up by the U.S. Army and U.S. Marine more restrictive for men or women or are they roughly the same? 92

93 Data from a National Health Survey Heights of adult women have a mean of 63.6 in. and a standard deviation of 2.5 in. Heights of adult men have a mean of 69.0 in. and a standard deviation of 2.8 in.
Height restrictions | Men minimum | Women minimum
U.S. Army | 60 in | 58 in
U.S. Marine Corps | 64 in | 58 in

94 Heights of adult men have a mean of 69.0 in. and a standard deviation of 2.8 in. Heights of adult women have a mean of 63.6 in. and a standard deviation of 2.5 in.
Branch | Men minimum | Women minimum
U.S. Army | 60 in, z-score = (60 - 69.0)/2.8 = -3.21, less restrictive | 58 in, z-score = (58 - 63.6)/2.5 = -2.24, more restrictive
U.S. Marine Corps | 64 in, z-score = (64 - 69.0)/2.8 = -1.79, more restrictive | 58 in, z-score = (58 - 63.6)/2.5 = -2.24, less restrictive

95 Effect of Standardization Standardization into z-scores does not change the shape of the histogram. Standardization into z-scores changes the center of the distribution by making the mean 0. Standardization into z-scores changes the spread of the distribution by making the standard deviation 1. 95

96 Z-score and Empirical Rule When data are bell shaped, the z-scores of the data values follow the empirical rule. 96

97 Outlier detection with z-score The empirical rule tells us that if the data have a mound-shaped distribution, then almost all the data-points are within ±3 standard deviations of the mean. So an observation whose z-score has an absolute value larger than 3 can be considered an outlier.

98 2004 Olympics Women's Heptathlon Austra Skujyte (Lithuania): Shot Put = 16.40 m, Long Jump = 6.30 m. Carolina Kluft (Sweden): Shot Put = 14.77 m, Long Jump = 6.78 m.
Statistic | Shot Put | Long Jump
Mean (all contestants) | 13.29 m | 6.16 m
Std. Dev. | 1.24 m | 0.23 m

99 Which performance was better? A. Skujyte's shot put: z-score = (16.40 - 13.29)/1.24 = 2.51. B. Kluft's long jump: z-score = (6.78 - 6.16)/0.23 = 2.70. C. Both were the same.

100 Based on shot put and long jump, whose performance was better? A. Skujyte's, z-scores: shot put = 2.51, long jump = 0.61. Total z-score = (2.51 + 0.61) = 3.12. B. Kluft's, z-scores: shot put = 1.19, long jump = 2.70. Total z-score = (1.19 + 2.70) = 3.89. C. Both were the same.
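The total z-scores can be recomputed from the event means and standard deviations given on the slides. This is a sketch with hypothetical names (`MEAN`, `STD`, `total_z`); like the slide, it simply adds the z-scores of the two events.

```python
# Means and standard deviations over all contestants, as given on the slide.
MEAN = {"shot_put": 13.29, "long_jump": 6.16}
STD = {"shot_put": 1.24, "long_jump": 0.23}


def total_z(results):
    """Sum of the z-scores of an athlete's results across events."""
    return sum((results[event] - MEAN[event]) / STD[event] for event in results)


skujyte = {"shot_put": 16.40, "long_jump": 6.30}
kluft = {"shot_put": 14.77, "long_jump": 6.78}

print(f"Skujyte: {total_z(skujyte):.2f}")
print(f"Kluft:   {total_z(kluft):.2f}")
```

Because z-scores are unit-free, results from events with different scales can be added in this way.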

101 Scatterplot 101

102 Example: Height and Weight How is weight of an individual related to his/her height? Typically, one can expect a taller person to be heavier. Is it supported by the data? If yes, how to determine this association? 102

103 What is a scatterplot? A scatterplot is a diagram which is used to display values of two quantitative variables from a data-set. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. 103

104 Example 1: Scatterplot of height and weight 104

105 Example 2: Scatterplot of hours watching TV and test scores 105

106 Looking at Scatterplots We look at the following features of a scatterplot: direction (positive or negative), form (linear, curved), strength (of the relationship) and unusual features. When we describe histograms we mention shape, center, spread and outliers.

107 Asking Questions on a Scatterplot Are test scores higher or lower when the TV watching is longer? Direction (positive or negative association). Does the cloud of points seem to show a linear pattern, a curved pattern, or no pattern at all? Form. If there is a pattern, how strong does the relationship look? Strength. Are there any unusual features? (2 or more groups or outliers). 107

108 Positive and Negative Associations Positive association means for most of the datapoints, a higher value of one variable corresponds to a higher value of the other variable and a lower value of one variable corresponds to a lower value of the other variable. Negative association means for most of the datapoints, a higher value of one variable corresponds to a lower value of the other variable and vice-versa. 108

109 This association is: A. positive B. negative. 109

110 This association is: A. positive B. negative. 110

111 Linear Scatterplot Unless we see a curve, we shall call the scatterplot linear. 111

112 Curved Scatterplot When the plot shows a clear curved pattern, we shall call it a curved scatterplot. 112

113 Which one has stronger linear association? A. left one, B. right one. The right one, because in the right graph the points are closer to a straight line.

114 Which one has stronger linear association? A. left one, B. right one. Hard to say.

115 Unusual Feature: Presence of Outlier This scatterplot clearly has an outlier. 115

116 Unusual Feature: Two Subgroups This scatterplot clearly has two subgroups. 116

117 Time series plot (Time plot) 117

118 Time plot A time series is a collection of observations made sequentially through time. In a time plot (or time series plot) the time series data are plotted (on the vertical axis) against time (on the horizontal axis), and the points are connected with straight lines. From a time series plot one can follow the movement of the observed values over time and find patterns such as: trend, seasonality, business cycle (for business data) and unusual features.

119 Example: US population [Figure: time series plot of US population]

120 Example: US accidental deaths [Figure: time series plot of deaths against index]

121 Example: Australian red wine sales


Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Lecture 2 Quantitative variables There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Stemplot (stem-and-leaf plot) Histogram Dot plot Stemplots

More information

The empirical ( ) rule

The empirical ( ) rule The empirical (68-95-99.7) rule With a bell shaped distribution, about 68% of the data fall within a distance of 1 standard deviation from the mean. 95% fall within 2 standard deviations of the mean. 99.7%

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Mean 26.86667 Standard Error 2.816392 Median 25 Mode 20 Standard Deviation 10.90784 Sample Variance 118.981 Kurtosis -0.61717 Skewness

More information

Lecture 1: Description of Data. Readings: Sections 1.2,

Lecture 1: Description of Data. Readings: Sections 1.2, Lecture 1: Description of Data Readings: Sections 1.,.1-.3 1 Variable Example 1 a. Write two complete and grammatically correct sentences, explaining your primary reason for taking this course and then

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

Chapter 5. Understanding and Comparing. Distributions

Chapter 5. Understanding and Comparing. Distributions STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Chapter 4.notebook. August 30, 2017

Chapter 4.notebook. August 30, 2017 Sep 1 7:53 AM Sep 1 8:21 AM Sep 1 8:21 AM 1 Sep 1 8:23 AM Sep 1 8:23 AM Sep 1 8:23 AM SOCS When describing a distribution, make sure to always tell about three things: shape, outliers, center, and spread

More information

(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables)

(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables) 3. Descriptive Statistics Describing data with tables and graphs (quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables) Bivariate descriptions

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms Interpreting Histograms Displaying Distributions: Stemplots Describing Center: Mean and Median Describing Variability: The Quartiles The

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

8/4/2009. Describing Data with Graphs

8/4/2009. Describing Data with Graphs Describing Data with Graphs 1 A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration. Examples: Hair color, white blood cell count,

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67 Chapter 6 The Standard Deviation as a Ruler and the Normal Model 1 /67 Homework Read Chpt 6 Complete Reading Notes Do P129 1, 3, 5, 7, 15, 17, 23, 27, 29, 31, 37, 39, 43 2 /67 Objective Students calculate

More information

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things. (c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals

More information

Section 3.2 Measures of Central Tendency

Section 3.2 Measures of Central Tendency Section 3.2 Measures of Central Tendency 1 of 149 Section 3.2 Objectives Determine the mean, median, and mode of a population and of a sample Determine the weighted mean of a data set and the mean of a

More information

Sampling, Frequency Distributions, and Graphs (12.1)

Sampling, Frequency Distributions, and Graphs (12.1) 1 Sampling, Frequency Distributions, and Graphs (1.1) Design: Plan how to obtain the data. What are typical Statistical Methods? Collect the data, which is then subjected to statistical analysis, which

More information

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest: 1 Chapter 3 - Descriptive stats: Numerical measures 3.1 Measures of Location Mean Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size Example: The number

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

The Empirical Rule, z-scores, and the Rare Event Approach

The Empirical Rule, z-scores, and the Rare Event Approach Overview The Empirical Rule, z-scores, and the Rare Event Approach Look at Chebyshev s Rule and the Empirical Rule Explore some applications of the Empirical Rule How to calculate and use z-scores Introducing

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

1.3: Describing Quantitative Data with Numbers

1.3: Describing Quantitative Data with Numbers 1.3: Describing Quantitative Data with Numbers Section 1.3 Describing Quantitative Data with Numbers After this section, you should be able to MEASURE center with the mean and median MEASURE spread with

More information

a table or a graph or an equation.

a table or a graph or an equation. Topic (8) POPULATION DISTRIBUTIONS 8-1 So far: Topic (8) POPULATION DISTRIBUTIONS We ve seen some ways to summarize a set of data, including numerical summaries. We ve heard a little about how to sample

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics CHAPTER OUTLINE 6-1 Numerical Summaries of Data 6- Stem-and-Leaf Diagrams 6-3 Frequency Distributions and Histograms 6-4 Box Plots 6-5 Time Sequence Plots 6-6 Probability Plots Chapter

More information

BNG 495 Capstone Design. Descriptive Statistics

BNG 495 Capstone Design. Descriptive Statistics BNG 495 Capstone Design Descriptive Statistics Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential statistical methods, with a focus

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Stat Lecture Slides Exploring Numerical Data. Yibi Huang Department of Statistics University of Chicago

Stat Lecture Slides Exploring Numerical Data. Yibi Huang Department of Statistics University of Chicago Stat 22000 Lecture Slides Exploring Numerical Data Yibi Huang Department of Statistics University of Chicago Outline In this slide, we cover mostly Section 1.2 & 1.6 in the text. Data and Types of Variables

More information

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

More information

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Histograms, Mean, Median, Five-Number Summary and Boxplots, Standard Deviation Thought Questions 1. If you were to

More information

Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table

Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table Lesson Plan Answer Questions Summary Statistics Histograms The Normal Distribution Using the Standard Normal Table 1 2. Summary Statistics Given a collection of data, one needs to find representations

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

Statistics lecture 3. Bell-Shaped Curves and Other Shapes

Statistics lecture 3. Bell-Shaped Curves and Other Shapes Statistics lecture 3 Bell-Shaped Curves and Other Shapes Goals for lecture 3 Realize many measurements in nature follow a bell-shaped ( normal ) curve Understand and learn to compute a standardized score

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

Describing Distributions with Numbers

Describing Distributions with Numbers Describing Distributions with Numbers Using graphs, we could determine the center, spread, and shape of the distribution of a quantitative variable. We can also use numbers (called summary statistics)

More information

Chapter 7: Statistics Describing Data. Chapter 7: Statistics Describing Data 1 / 27

Chapter 7: Statistics Describing Data. Chapter 7: Statistics Describing Data 1 / 27 Chapter 7: Statistics Describing Data Chapter 7: Statistics Describing Data 1 / 27 Categorical Data Four ways to display categorical data: 1 Frequency and Relative Frequency Table 2 Bar graph (Pareto chart)

More information

Introduction to Probability and Statistics Slides 1 Chapter 1

Introduction to Probability and Statistics Slides 1 Chapter 1 1 Introduction to Probability and Statistics Slides 1 Chapter 1 Prof. Ammar M. Sarhan, asarhan@mathstat.dal.ca Department of Mathematics and Statistics, Dalhousie University Fall Semester 2010 Course outline

More information

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

Averages How difficult is QM1? What is the average mark? Week 1b, Lecture 2

Averages How difficult is QM1? What is the average mark? Week 1b, Lecture 2 Averages How difficult is QM1? What is the average mark? Week 1b, Lecture 2 Topics: 1. Mean 2. Mode 3. Median 4. Order Statistics 5. Minimum, Maximum, Range 6. Percentiles, Quartiles, Interquartile Range

More information

Range The range is the simplest of the three measures and is defined now.

Range The range is the simplest of the three measures and is defined now. Measures of Variation EXAMPLE A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test.

More information

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives F78SC2 Notes 2 RJRC Algebra It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and just substitute in the numbers at the end. Example: 100 invested

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Descriptive statistics

Descriptive statistics Patrick Breheny February 6 Patrick Breheny to Biostatistics (171:161) 1/25 Tables and figures Human beings are not good at sifting through large streams of data; we understand data much better when it

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline.

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. data; variables: categorical & quantitative; distributions; bar graphs & pie charts: What Is Statistics?

More information

Unit 1: Statistics. Mrs. Valentine Math III

Unit 1: Statistics. Mrs. Valentine Math III Unit 1: Statistics Mrs. Valentine Math III 1.1 Analyzing Data Statistics Study, analysis, and interpretation of data Find measure of central tendency Mean average of the data Median Odd # data pts: middle

More information

MATH 117 Statistical Methods for Management I Chapter Three

MATH 117 Statistical Methods for Management I Chapter Three Jubail University College MATH 117 Statistical Methods for Management I Chapter Three This chapter covers the following topics: I. Measures of Center Tendency. 1. Mean for Ungrouped Data (Raw Data) 2.

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Biostatistics Presentation of data DR. AMEER KADHIM HUSSEIN M.B.CH.B.FICMS (COM.)

Biostatistics Presentation of data DR. AMEER KADHIM HUSSEIN M.B.CH.B.FICMS (COM.) Biostatistics Presentation of data DR. AMEER KADHIM HUSSEIN M.B.CH.B.FICMS (COM.) PRESENTATION OF DATA 1. Mathematical presentation (measures of central tendency and measures of dispersion). 2. Tabular

More information

Continuous random variables

Continuous random variables Continuous random variables A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The total area under a density

More information

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami Unit Two Descriptive Biostatistics Dr Mahmoud Alhussami Descriptive Biostatistics The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Week 1 Chapter 1 Introduction What is Statistics? Why do you need to know Statistics? Technical lingo and concepts:

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.

More information