- Chastity Quinn
- 5 years ago
Chapter 3 - Descriptive stats: Numerical measures

3.1 Measures of Location

Mean

Perhaps the most important measure of location is the mean (average).

Sample mean: $\bar{x} = \frac{\sum x_i}{n}$, where n = sample size.

Example: For the class-size data (number of students per class), the mean is $\bar{x} = 44$.

Median

The median is another measure of location for a variable. It is the value in the middle when the data are arranged in ascending order (smallest to largest value).

Computation:
- Arrange the data in ascending order (smallest to largest value).
- For an odd number of observations, the median is the middle value.
- For an even number of observations, the median is the average of the middle 2 values.

Example: For the class-size data, arrange the values from smallest to largest. Middle value = Median = 46.

Copyright Reserved
Example

The yearly income (R1000s) of 8 workers is given. Calculate the mean and the median.

Answers:
- Mean/average: $\bar{x} = \frac{\sum x_i}{n}$ with n = 8.
- Median: arrange the values from smallest to largest. With 8 observations, the median is the average of the 4th and 5th values.

Although the mean is the more commonly used measure of central location, in some situations the median is preferred. The mean is influenced by extremely small and large data values, while the median is not influenced by extreme values.

Mode

Definition: The mode is the value that occurs with the greatest frequency.

Example: For the class-size data (number of students per class), the mode is 46.

Note:
- Bi-modal: the data have exactly 2 modes.
- Multimodal: the data have more than 2 modes.
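The three measures of location above can be checked with a short Python sketch. The class sizes used here are an assumption: the raw values are not preserved in these notes, so the list was chosen to agree with the quoted results (mean 44, median 46, mode 46). The added value of 200 is purely hypothetical and only illustrates how an extreme value affects the mean but not the median.

```python
from statistics import mean, median, multimode

# Assumed class sizes (not preserved in the notes); chosen to match the
# quoted results: mean 44, median 46, mode 46.
class_sizes = [46, 54, 42, 46, 32]

sample_mean = mean(class_sizes)        # sum of the values divided by n
sample_median = median(class_sizes)    # middle value of the sorted data
sample_modes = multimode(class_sizes)  # every value with the highest frequency

# Hypothetical extreme value: it pulls the mean upwards, while the median
# of this particular data set stays where it was.
with_extreme = class_sizes + [200]

print(sample_mean, sample_median, sample_modes)
print(mean(with_extreme), median(with_extreme))
```

`multimode` returns a list, so a bi-modal or multimodal data set would simply produce more than one entry.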
Example: Give the appropriate measure of location for the following data:

Soft drink      Frequency
Coke Classic    19
Diet Coke       8
Dr. Pepper      5
Pepsi-Cola      13
Sprite          5

The mode is: Coke Classic. For this type of data (categories rather than numbers) it makes no sense to speak of the mean or median.

Using Microsoft Excel 2007 to compute the mean, median and mode

[Excel screenshots: formula worksheet and value worksheet]
Percentiles

Definition: The p-th percentile is a value such that at least p percent of the observations are less than or equal to this value and at least (100 - p) percent of the observations are greater than or equal to this value.

Calculating the p-th percentile:
- Arrange the data in ascending order (smallest to largest value).
- Compute an index $i = \left(\frac{p}{100}\right) n$, where p = percentile of interest and n = sample size.
  (a) If i is not an integer, round up: the p-th percentile is the value in the next integer position above i.
  (b) If i is an integer, the p-th percentile is the average of the values in positions i and (i + 1).

Example: Determine the 85th percentile ($P_{85}$) for the starting salary data:
Step 1: Arrange the data in ascending order.
Step 2: $i = \left(\frac{p}{100}\right) n = \left(\frac{85}{100}\right)(12) = 10.2$, which is not an integer, so round up to 11.
Step 3: $P_{85}$ is the value in the 11th position (after the data are arranged in ascending order): $P_{85} = 3730$.
Interpretation: 85% of the graduates have a starting salary of R3 730 or less.
Determine the 33rd percentile ($P_{33}$) for the starting salary data:
Step 1: Arrange the data in ascending order.
Step 2: $i = \left(\frac{33}{100}\right)(12) = 3.96$, which is not an integer, so round up to 4.
Step 3: $P_{33}$ is the value in the 4th position (after the data are arranged in ascending order): $P_{33} = 3480$.
Interpretation: 33% of the graduates have a starting salary of R3 480 or less.

Determine the median ($P_{50}$) for the starting salary data:
Step 1: Arrange the data in ascending order.
Step 2: $i = \left(\frac{50}{100}\right)(12) = 6$, which is an integer, so we also need position i + 1 = 7.
Step 3: The median is the average of the values in the 6th and 7th positions: $P_{50} = 3505$.
Interpretation: 50% of the graduates have a starting salary of R3 505 or less.
Determine the 25th percentile ($P_{25}$) for the starting salary data:
Step 1: Arrange the data in ascending order.
Step 2: $i = \left(\frac{25}{100}\right)(12) = 3$, which is an integer, so we also need position i + 1 = 4.
Step 3: $P_{25}$ is the average of the values in the 3rd and 4th positions: $P_{25} = 3465$.
Interpretation: 25% of the graduates have a starting salary of R3 465 or less.

Determine the 75th percentile ($P_{75}$) for the starting salary data:
Step 1: Arrange the data in ascending order.
Step 2: $i = \left(\frac{75}{100}\right)(12) = 9$, which is an integer, so we also need position i + 1 = 10.
Step 3: $P_{75}$ is the average of the values in the 9th and 10th positions: $P_{75} = 3600$.
Interpretation: 75% of the graduates have a starting salary of R3 600 or less.
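The index rule used in the percentile examples can be sketched in Python as follows. The function name `percentile` and the salary list are assumptions: the raw starting salaries are not preserved in these notes, so the 12 values below were chosen to reproduce the quoted results (3730, 3480, 3505, 3465 and 3600). The sketch assumes p is a whole number and uses integer arithmetic so that the "is i an integer?" test is not affected by floating-point rounding.

```python
def percentile(data, p):
    """p-th percentile using the index rule i = (p / 100) * n (p a whole number)."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)                  # Step 1: ascending order
    n = len(values)
    whole, remainder = divmod(p * n, 100)  # Step 2: i = whole + remainder / 100
    if remainder != 0:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in a 0-based list.
        return values[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (values[whole - 1] + values[whole]) / 2


# Assumed starting salaries (not preserved in the notes).
salaries = [3450, 3550, 3650, 3480, 3355, 3310,
            3490, 3730, 3540, 3925, 3520, 3480]

for p in (25, 33, 50, 75, 85):
    print(p, percentile(salaries, p))
```

Other software (including Excel) may interpolate between positions, so its percentiles can differ slightly from this rule.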
Quartiles
- $Q_1$ = first quartile = 25th percentile
- $Q_2$ = second quartile = 50th percentile = median
- $Q_3$ = third quartile = 75th percentile

3.2 Measures of variability

Range

Range = Largest Value - Smallest Value

Example of the salary data. The range is: 3925 - 3310 = 615.

Advantages:
- Easy to calculate.

Disadvantages:
- It is sensitive to just 2 data values: the Largest Value and the Smallest Value.
- It is unstable: it is influenced by extreme values. If one of the graduates received a much larger starting salary, the range would be determined almost entirely by that single value.
Interquartile Range - IQR

$IQR = Q_3 - Q_1$

It is the range for the middle 50% of the data.

Example of the salary data. The interquartile range for the salary data is: $IQR = Q_3 - Q_1 = 3600 - 3465 = 135$.

Advantages:
- Easy to interpret.
- Is not influenced by extreme values.

Disadvantages:
- It is only based on the middle 50% of the data.

Variance

The variance is a measure of variability that utilizes all the data.

Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Standard Deviation

Sample standard deviation: $s = \sqrt{s^2}$, and therefore $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$.
Example

Calculate the standard deviation of the class sizes.

The calculation uses a table with the following columns: number of students in class ($x_i$), mean class size ($\bar{x}$), deviation about the mean ($x_i - \bar{x}$) and squared deviation about the mean ($(x_i - \bar{x})^2$).

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 64$ and $s = \sqrt{64} = 8$

Interpretation: On average, the class sizes deviate from the average class size (44) by about 8 students.

Coefficient of Variation

- It is a relative measure of variability.
- It measures the standard deviation relative to the mean.

Coefficient of Variation: $CV = \left(\frac{s}{\bar{x}} \times 100\right)\%$

For the class sizes, $CV = \left(\frac{8}{44} \times 100\right)\% \approx 18.2\%$. The coefficient of variation tells us that the sample standard deviation is 18.2% of the value of the sample mean.
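The deviation table and the resulting variance, standard deviation and coefficient of variation can be reproduced with a minimal Python sketch. The class sizes are an assumption (the raw values are not preserved in these notes); they were chosen to agree with the quoted mean of 44 and standard deviation of 8.

```python
from math import sqrt

# Assumed class sizes (not preserved in the notes); consistent with the
# quoted mean of 44 and standard deviation of 8.
class_sizes = [46, 54, 42, 46, 32]

n = len(class_sizes)
x_bar = sum(class_sizes) / n                     # sample mean
deviations = [x - x_bar for x in class_sizes]    # x_i - x_bar
squared = [d ** 2 for d in deviations]           # (x_i - x_bar)^2

sample_variance = sum(squared) / (n - 1)         # divide by n - 1, not n
sample_sd = sqrt(sample_variance)
cv_percent = sample_sd / x_bar * 100             # standard deviation relative to the mean

# One row per observation, mirroring the columns of the table.
for x, d, sq in zip(class_sizes, deviations, squared):
    print(x, x_bar, d, sq)
print(sample_variance, sample_sd, round(cv_percent, 1))
```

The deviations about the mean always sum to zero, which is a quick check that the mean was computed correctly.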
Example: The class test mark (out of 10) and the semester test mark (out of 50) of 5 students are investigated.

- Average of class test marks = 6
- Average of semester test marks = 26
- Variance of class test marks = 2.5
- Variance of semester test marks =

Which test has the biggest relative variation? Calculate the relevant numerical measures.

Coefficient of variation for the class test marks: $CV = \left(\frac{\sqrt{2.5}}{6} \times 100\right)\% \approx 26.4\%$

Coefficient of variation for the semester test marks: $CV = \left(\frac{s}{26} \times 100\right)\%$, where s is the standard deviation of the semester test marks.

Therefore, the semester test has the biggest relative variation.

Using Microsoft Excel 2007's Descriptive Statistics Tool

Self-study (see page 115).

3.3 Measures of Distribution Shape, Relative Location and Detecting Outliers

Distribution Shapes

Read through by yourself.

z-Scores

z-score: $z_i = \frac{x_i - \bar{x}}{s}$

The z-score is called the standardized value. It can be interpreted as the number of standard deviations $x_i$ is from the mean.
Example: z-scores of the class sizes dataset. (We calculated the mean and standard deviation previously: $\bar{x} = 44$ and s = 8.)

The calculation uses a table with the following columns: number of students in class ($x_i$), deviation about the mean ($x_i - \bar{x}$) and z-score ($\frac{x_i - \bar{x}}{s}$).

Interpretation: 54 is 1.25 standard deviations above the mean. 32 is 1.5 standard deviations below the mean.

Example: The Mathematics marks of 2 students are compared.
- Student 1: 75% (in School A)
- Student 2: 80% (in School B)

Which one has done the best, relative to his school? The mean and standard deviation (s) of the marks are given for School A and School B.

Student 1: $z = \frac{75 - \bar{x}_A}{s_A} = 2.5$. Student 1's mark is 2.5 standard deviations above the mean of School A.

Student 2: $z = \frac{80 - \bar{x}_B}{s_B} = 0$. Student 2's mark is exactly the same value as the mean of School B.

Conclusion: Student 1 has done relatively better in his school than Student 2.
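The z-score calculation for the class sizes can be written as a small Python helper. The function name `z_score` is illustrative; the inputs (mean 44, standard deviation 8, observations 54 and 32) are the values quoted in the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


z_54 = z_score(54, 44, 8)   # positive: above the mean
z_32 = z_score(32, 44, 8)   # negative: below the mean
print(z_54, z_32)
```

Because z-scores are unit-free, they allow marks from different schools or tests to be compared on the same scale.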
Chebyshev's Theorem

Not for examination.

Empirical Rule

For data with a bell-shaped distribution:
- Approximately 68% of the data values will be within 1 standard deviation of the mean.
- Approximately 95% of the data values will be within 2 standard deviations of the mean.
- Almost all (approximately 100%) of the data values will be within 3 standard deviations of the mean.
Example of the application of the empirical rule:

Suppose IQ scores have a bell-shaped distribution with a mean of 100 and a standard deviation of 15.

a) What percentage of people should have an IQ score between 85 and 115?
   Answer = 68% (85 and 115 lie 1 standard deviation below and above the mean).
b) What percentage of people should have an IQ score between 70 and 130?
   Answer = 95% (70 and 130 lie 2 standard deviations below and above the mean).
c) What percentage of people should have an IQ score of more than 130?
   Answer = 2.5%, since 100% - 95% = 5% of the scores lie outside 70 to 130 and, by symmetry, 5%/2 = 2.5% lie above 130.
d) The 16th percentile ($P_{16}$) is equal to: 100% - 68% = 32% and 32%/2 = 16%, so 16% of the scores lie below 85. Therefore, $P_{16} = 85$.
e) The 84th percentile ($P_{84}$) is equal to: 16% + 68% = 84%, so 84% of the scores lie below 115. Therefore, $P_{84} = 115$.
f) Is a person with an IQ score of 160 seen as an outlier?
   Yes: since approximately 100% of the values are between 55 and 145, an IQ score of 160 is seen as an outlier. OR: $z = \frac{160 - 100}{15} = 4 > 3$ (see the next section on outliers).
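The arithmetic behind answers a) to f) can be laid out as a short Python sketch. The names `empirical_interval` and `EMPIRICAL_SHARE` are illustrative, and the percentages are the approximate values of the empirical rule, not exact probabilities.

```python
def empirical_interval(mean, std_dev, k):
    """Interval covering the values within k standard deviations of the mean."""
    return (mean - k * std_dev, mean + k * std_dev)


# Approximate share of the data (in %) within k standard deviations of the
# mean for a bell-shaped distribution.
EMPIRICAL_SHARE = {1: 68, 2: 95, 3: 100}

mean_iq, sd_iq = 100, 15
for k, share in EMPIRICAL_SHARE.items():
    low, high = empirical_interval(mean_iq, sd_iq, k)
    print(f"about {share}% of scores lie between {low} and {high}")

# By symmetry, the share outside an interval is split equally between the tails.
share_above_130 = (100 - EMPIRICAL_SHARE[2]) / 2   # question c)
share_below_85 = (100 - EMPIRICAL_SHARE[1]) / 2    # question d)

# Question f): z-score rule for outliers.
z_160 = (160 - mean_iq) / sd_iq
is_outlier = abs(z_160) > 3
print(share_above_130, share_below_85, z_160, is_outlier)
```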
Detecting Outliers

Sometimes a data set will have one or more observations with unusually large or unusually small values. These extreme values are called outliers. Standardized values (z-scores) can be used to identify outliers.

In the case of a bell-shaped distribution, the following rule can be applied: since approximately 100% of the data will be within 3 standard deviations of the mean, we recommend treating any data value with a z-score < -3 or a z-score > 3 as an outlier.

3.4 Exploratory Data Analysis

Five-Number Summary

The following 5 numbers are used to summarize the data:
1. Smallest Value
2. First Quartile ($Q_1$)
3. Second Quartile ($Q_2$), the median
4. Third Quartile ($Q_3$)
5. Largest Value

The five-number summary of the salary data is:
- Smallest value = 3310
- $Q_1$ = 3465
- $Q_2$ = 3505 (Median)
- $Q_3$ = 3600
- Largest value = 3925
(These values have been calculated previously.)
Box Plot

A box plot is a graphical summary of data that is based on a five-number summary. A box plot provides another way to identify outliers.

For the salary data ($Q_1$ = 3465, $Q_3$ = 3600, IQR = 135):

Upper limit = $Q_3$ + (1.5)(IQR) = 3600 + (1.5)(135) = 3802.5
Lower limit = $Q_1$ - (1.5)(IQR) = 3465 - (1.5)(135) = 3262.5

If a point falls above the upper limit or below the lower limit, the point is seen as an outlier.
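The limits of the box plot rule can be computed with a few lines of Python. The quartiles and the smallest and largest values are those quoted for the salary data; the helper name `is_outlier` is illustrative.

```python
q1, q3 = 3465, 3600          # quartiles of the salary data
iqr = q3 - q1                # interquartile range

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr


def is_outlier(value):
    """True if the value falls outside the box plot limits."""
    return value < lower_limit or value > upper_limit


smallest, largest = 3310, 3925
print(lower_limit, upper_limit)
print(is_outlier(smallest), is_outlier(largest))
```

Under this rule the largest salary (3925) lies above the upper limit and is flagged, while the smallest salary (3310) is not.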
Box plots and skewness:
- The median is in the middle of the box, indicating symmetry.
- The median is not centered in the middle of the box. The median is closer to $Q_1$, indicating that the shape of the distribution is skewed to the right.
- The median is not centered in the middle of the box. The median is closer to $Q_3$, indicating that the shape of the distribution is skewed to the left.

Skewness:
- Skewed to the left (negative skew): The left tail is longer; the mass of the distribution is concentrated on the right of the figure. It has relatively few low values.
- Skewed to the right (positive skew): The right tail is longer; the mass of the distribution is concentrated on the left of the figure. It has relatively few high values.
- Symmetric: The left and right tails mirror each other.

Note: A normal distribution is symmetric.
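Measures of association between two variables

Covariance

Sample covariance: a measure of the linear relationship between x and y.

$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Note:
- $s_{xy} > 0$: positive linear relationship
- $s_{xy} < 0$: negative linear relationship
- $s_{xy} \approx 0$: no linear relationship

Note (not in the textbook):

$s_{xx} = \frac{\sum (x_i - \bar{x})(x_i - \bar{x})}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1} = s_x^2$, where $s_x^2$ denotes the sample variance of the x observations.

Similarly: $s_{yy} = \frac{\sum (y_i - \bar{y})(y_i - \bar{y})}{n - 1} = \frac{\sum (y_i - \bar{y})^2}{n - 1} = s_y^2$, where $s_y^2$ denotes the sample variance of the y observations.

Calculations for the variance and standard deviation of x, the variance and standard deviation of y, and the covariance between x and y use a table with the following columns: $x_i$, $y_i$, $(x_i - \bar{x})$, $(y_i - \bar{y})$, $(x_i - \bar{x})^2$, $(y_i - \bar{y})^2$ and $(x_i - \bar{x})(y_i - \bar{y})$, together with the means $\bar{x}$ and $\bar{y}$.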
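1. Calculate the variance and the standard deviation of x: $s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ and $s_x = \sqrt{s_x^2}$.
2. Calculate the variance and the standard deviation of y: $s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$ and $s_y = \sqrt{s_y^2}$.
3. Calculate and interpret the covariance between x and y: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$. Since $s_{xy} > 0$, there is a positive linear relationship between x and y.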
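The three calculation steps above can be sketched in Python. The paired data below are hypothetical (the x and y values of the example are not preserved in these notes), and the function names are illustrative.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared deviations about the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_covariance(x, y):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical paired observations, for illustration only.
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

s_x = sqrt(sample_variance(x))
s_y = sqrt(sample_variance(y))
s_xy = sample_covariance(x, y)
print(s_x, s_y, s_xy)   # a positive s_xy points to a positive linear relationship
```

The covariance of a variable with itself equals its variance, which matches the note that $s_{xx} = s_x^2$.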
Interpretation of sample covariance

[Figure: two scatter plots of y against x, one showing a positive linear relationship and one showing a negative linear relationship]

Correlation Coefficient

The correlation coefficient measures the strength of the linear relationship between x and y:

$r_{xy} = \frac{s_{xy}}{s_x s_y}$

where
$s_{xy}$ = sample covariance between x and y,
$s_x$ = sample standard deviation of x,
$s_y$ = sample standard deviation of y.

For the example, the value of $r_{xy}$ indicates a strong positive linear relationship between x and y.
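Interpretation of the Correlation Coefficient

The correlation coefficient measures the linear relationship between x and y, with $-1 \le r_{xy} \le 1$:
i. Positive linear relationship: $r_{xy} > 0$. Perfect positive linear relationship: $r_{xy} = +1$.
ii. Negative linear relationship: $r_{xy} < 0$. Perfect negative linear relationship: $r_{xy} = -1$.
iii. No linear relationship (any relationship present is non-linear): $r_{xy} = 0$.

Reading the scale from -1 to +1:
- $r_{xy}$ close to -1: strong negative linear relationship between x and y
- $r_{xy}$ negative but close to 0: weak negative linear relationship between x and y
- $r_{xy}$ equal to 0: no linear relationship between x and y
- $r_{xy}$ positive but close to 0: weak positive linear relationship between x and y
- $r_{xy}$ close to +1: strong positive linear relationship between x and y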
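A minimal Python sketch of the correlation coefficient follows. The function name `correlation` and all data are hypothetical and serve only to illustrate the sign and size of $r_{xy}$.

```python
from math import sqrt


def correlation(x, y):
    """Sample correlation r_xy = s_xy / (s_x * s_y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data sets, for illustration only.
r_perfect_positive = correlation([1, 2, 3], [2, 4, 6])   # points on a rising line
r_perfect_negative = correlation([1, 2, 3], [6, 4, 2])   # points on a falling line
r_strong_positive = correlation(
    [2, 5, 1, 3, 4, 1, 5, 3, 4, 2],
    [50, 57, 41, 54, 54, 38, 63, 48, 59, 46],
)
print(r_perfect_positive, r_perfect_negative, round(r_strong_positive, 2))
```

Unlike the covariance, the correlation coefficient does not depend on the units of x and y, so its size can be read directly against the scale from -1 to +1.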
Using Microsoft Excel 2007 to compute the covariance and correlation coefficient

[Excel screenshots: formula worksheet and value worksheet]

Note: We have to adjust the Excel result of 9.9 for the covariance, since the COVAR function in Excel calculates the population covariance:

$s_{xy} = \sigma_{xy} \left(\frac{n}{n - 1}\right) = 9.9 \left(\frac{n}{n - 1}\right)$

where $s_{xy}$ = sample covariance and $\sigma_{xy}$ = population covariance.
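The adjustment from population covariance to sample covariance can be sketched as a one-line Python helper. The function name is illustrative, and the sample size n = 10 is an assumption made only for this illustration, since n is not preserved in these notes.

```python
def population_to_sample_covariance(population_cov, n):
    """Rescale a population covariance (divisor n) to a sample covariance (divisor n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return population_cov * n / (n - 1)


excel_covar = 9.9    # population covariance reported by COVAR
n = 10               # assumed sample size, for illustration only
s_xy = population_to_sample_covariance(excel_covar, n)
print(round(s_xy, 4))
```

The factor n / (n - 1) is always greater than 1, so the sample covariance is slightly larger in absolute value than the population covariance, and the difference shrinks as n grows.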
Homework (work through the following example on your own):

The class test mark (out of 10) (x) and the semester test mark (out of 50) (y) of 5 students are investigated.

(a) Calculate the mean mark and the variance for the class test: $\bar{x} = \frac{\sum x_i}{n}$ and $s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$.
(b) Calculate the mean mark and the variance for the semester test: $\bar{y} = \frac{\sum y_i}{n}$ and $s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$.
(c) Calculate and interpret the standard deviation for the semester test: $s_y = \sqrt{s_y^2}$. The standard deviation is the average deviation of the semester test marks from the average ($\bar{y}$).
(d) Calculate and interpret the covariance. Answer: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$, using a table with the columns $x_i$, $y_i$, $(x_i - \bar{x})$, $(y_i - \bar{y})$ and $(x_i - \bar{x})(y_i - \bar{y})$. There is a positive linear relationship between x and y.
(e) Calculate and interpret the correlation coefficient: $r_{xy} = \frac{s_{xy}}{s_x s_y}$. There is a strong positive linear relationship between x and y.
(f) Suppose a student obtained 6/10 for the class test and 30/50 for the semester test. In which test did the student perform the best, relative to the other students? Compare the z-scores: $z_x = \frac{6 - \bar{x}}{s_x}$ and $z_y = \frac{30 - \bar{y}}{s_y}$. The student performed the best in the semester test, relative to the other students.
The weighted mean and working with grouped data

Weighted Mean

$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$, where $w_i$ = weight for observation i.

Example: Consider the following sample of 5 purchases of raw material. For each purchase, the cost per pound ($) and the number of pounds are recorded.

Question: What is the mean cost per pound for the raw material?

The weighted mean uses the number of pounds as the weights: $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$, with $x_i$ = cost per pound and $w_i$ = number of pounds for purchase i.

Example: The net full supply capacity (FSC) (in millions of cubic metres) in the various regions and catchment areas in South Africa, and also the percentage content as on 31 August 1992, are given for Vaaldam, Bloemhofdam and Sterkfonteindam.

Question: Calculate the weighted mean for the % content in the catchment area, using the FSC of each dam as its weight: $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$, with $x_i$ = % content and $w_i$ = FSC of dam i.
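The weighted mean can be sketched in Python as follows. The purchase data are hypothetical (the costs and quantities of the example are not preserved in these notes), and the function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical purchases: cost per pound ($) and number of pounds bought.
cost_per_pound = [3.00, 3.40, 2.80, 2.90, 3.25]
pounds = [1200, 500, 2750, 1000, 800]

mean_cost = weighted_mean(cost_per_pound, pounds)
simple_mean = sum(cost_per_pound) / len(cost_per_pound)
print(round(mean_cost, 2), round(simple_mean, 2))
```

The simple mean treats every purchase equally, while the weighted mean gives large purchases more influence, which is why the two results differ.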
Grouped data

The audit times (in days) for 20 clients are summarized in a frequency table with the columns: audit time class, frequency ($f_i$) and class midpoint ($M_i$).

Sample mean for grouped data: $\bar{x} = \frac{\sum f_i M_i}{n}$, where $M_i$ = the midpoint for class i and $f_i$ = the frequency for class i.

Sample variance for grouped data: $s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1} = 30$

The standard deviation: $s = \sqrt{30} \approx 5.48$
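The grouped-data formulas can be illustrated with a short Python sketch. The midpoints and frequencies are assumptions (the frequency table is not preserved in these notes); they were chosen to be consistent with the 20 clients and the variance of 30 quoted above.

```python
from math import sqrt

# Assumed frequency table (not preserved in the notes): class midpoints M_i
# and frequencies f_i, consistent with n = 20 and a variance of 30.
midpoints = [12, 17, 22, 27, 32]
frequencies = [4, 8, 5, 2, 1]

n = sum(frequencies)
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
grouped_variance = sum(
    f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints)
) / (n - 1)
grouped_sd = sqrt(grouped_variance)
print(n, grouped_mean, grouped_variance, round(grouped_sd, 2))
```

Because every observation in a class is replaced by the class midpoint, these results are approximations of the values that the raw data would give.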
Homework (go through this example on your own)

Automobiles traveling on a road that has a posted speed limit of 55 miles per hour are checked for speed by a state police radar system. A frequency distribution of the speeds (in miles per hour) is given.

(a) Calculate the average speed of the automobiles.
(b) Calculate the variance and the standard deviation.
Typical exam questions:

The annual amounts (in $ millions) spent on research and development for a random sample of 30 electronic component manufacturers are given in the following Excel spreadsheets. By using the Sort option in Excel, the data set is sorted according to the amount spent.

[Excel screenshots: unsorted and sorted data]

The annual amounts (in $ millions) for electronic component manufacturers have a bell-shaped distribution with a mean of 20 and a standard deviation of 7.

Question 1
The range is:
Answer 1
Range = $x_{max} - x_{min}$ = 38 - 6 = 32.

Question 2
The median is:
Answer 2
$i = \left(\frac{p}{100}\right) n = \left(\frac{50}{100}\right)(30) = 15$. Since i is an integer, we need to take the average of the values in the 15th and 16th positions. In position 15 we have 20 and in position 16 we have 20, therefore the median is (20 + 20)/2 = 20.

Question 3
The data type of annual amounts is:
Answer 3
Continuous

Question 4
According to the coefficient of variation:
Answer 4
$CV = \left(\frac{s}{\bar{x}} \times 100\right)\% = \left(\frac{7}{20} \times 100\right)\% = 35\%$. The standard deviation is 35% of the average.
Questions 5 to 8 are based on the following information:

The relationship between the age (in years) of a motorist and the speed (in km/h) of the car on the highway is summarised in the following Excel spreadsheet:

[Excel screenshots: formula sheet and value sheet]

Question 5
The variance of the age of the motorists is:
Answer 5
$s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Question 6
The coefficient of variation of the age of the motorists is:
Answer 6
$CV = \left(\frac{s_x}{\bar{x}} \times 100\right)\%$
Question 7
The sample covariance is:
Answer 7
Sample covariance = Population covariance $\times \left(\frac{n}{n - 1}\right)$

Question 8
The relationship between the age of a motorist and the speed of the car on the highway can be described as:
(A) no linear relationship
(B) a strong negative linear relationship
(C) a weak negative linear relationship
(D) a strong positive linear relationship
(E) a weak positive linear relationship
Answer 8
The correlation coefficient r is close to -1. Consequently, we have a strong negative linear relationship.

Questions 9 to 11 are based on the following information:

Consider the following set of descriptive statistics on time per week (in hours) spent on campaigning for the upcoming general election for a specific political party:

Descriptive statistic    Value
Smallest value           8
Largest value            36

Question 9
The distribution of time per week (in hours) is:
(A) Bimodal
(B) Multimodal
(C) Symmetrical
(D) Skewed to the right
(E) Skewed to the left
Answer 9
$Q_1$ and $Q_3$ are equally far away from the median, therefore the distribution is symmetrical. In the box plot, the median is in the middle of the box, indicating symmetry.

[Figure: box plot with the median in the middle of the box]
Question 10
Using the box and whisker plot approach, an outlier is a value greater than:
Answer 10
Upper limit = $Q_3$ + (1.5)(IQR)

Question 11
The z-score (standardised value) for the largest value in the data set is:
Answer 11
$z = \frac{x - \bar{x}}{s}$ with x = 36, the largest value.
Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 04 - Sections 2.5 and 2.6 1. A travel magazine recently presented data on the annual number of vacation
More information(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables)
3. Descriptive Statistics Describing data with tables and graphs (quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables) Bivariate descriptions
More informationMidrange: mean of highest and lowest scores. easy to compute, rough estimate, rarely used
Measures of Central Tendency Mode: most frequent score. best average for nominal data sometimes none or multiple modes in a sample bimodal or multimodal distributions indicate several groups included in
More informationRecap: Ø Distribution Shape Ø Mean, Median, Mode Ø Standard Deviations
DAY 4 16 Jan 2014 Recap: Ø Distribution Shape Ø Mean, Median, Mode Ø Standard Deviations Two Important Three-Standard-Deviation Rules 1. Chebychev s Rule : Implies that at least 89% of the observations
More informationMeasures of center. The mean The mean of a distribution is the arithmetic average of the observations:
Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number