2011 Pearson Education, Inc

2 Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data

3 Summary of Central Tendency Measures
Measure | Formula | Description
Mean | $\bar{x} = \sum x_i / n$ | Balance point
Median | Position $(n + 1)/2$ when ordered | Middle value
Mode | none | Most frequent

4 Shape
1. Describes how data are distributed
2. Measures of shape: skew = symmetry
Left-skewed: Mean < Median
Symmetric: Mean = Median
Right-skewed: Median < Mean
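As a rough illustration of the central tendency measures and the mean-versus-median comparison above, the sketch below uses Python's standard `statistics` module on the closing stock prices from the Thinking Challenge later in this chapter. The variable names are illustrative assumptions, and the skew label is only a quick hint, not a formal test.

```python
import statistics

# Closing stock prices from the Thinking Challenge in this chapter
prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean = statistics.mean(prices)      # balance point: sum(x_i) / n
median = statistics.median(prices)  # middle value of the ordered data
mode = statistics.mode(prices)      # most frequent value

# Comparing mean and median gives a quick hint about the direction of skew
if mean < median:
    shape = "left-skewed"
elif mean > median:
    shape = "right-skewed"
else:
    shape = "symmetric"

print(mean, median, mode, shape)
```

With only eight observations the comparison is fragile, so it is best read together with a histogram or box plot.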

5 2.5 Numerical Measures of Variability

6 Range
1. Measure of dispersion
2. Difference between largest & smallest observations: Range = $x_{\text{largest}} - x_{\text{smallest}}$
3. Ignores how data are distributed: two differently distributed data sets can share the same range, e.g. Range = $10 - 7 = 3$ for both

7 Variance & Standard Deviation
1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about mean ($\bar{x}$ or $\mu$)

8 Standard Notation
Measure | Sample | Population
Mean | $\bar{x}$ | $\mu$
Standard Deviation | $s$ | $\sigma$
Variance | $s^2$ | $\sigma^2$
Size | $n$ | $N$

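9 Sample Variance Formula
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
Note: $n - 1$ in the denominator!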

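10 Sample Standard Deviation Formula
$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$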

11 Variance Example Raw Data:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 8.3$

12 Thinking Challenge You're a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. What are the variance and standard deviation of the stock prices?

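13 Variation Solution* Sample Variance
Raw Data: 17 16 21 18 13 16 12 11
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 15.5$
$s^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8 - 1} = \frac{78}{7} \approx 11.14$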

14 Variation Solution* Sample Standard Deviation
$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{11.14} \approx 3.34$
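The Thinking Challenge can be checked numerically. The following is a minimal sketch that applies the sample variance formula (with $n - 1$ in the denominator) directly to the eight stock prices; the variable names are illustrative assumptions.

```python
import math

# Closing stock prices from the Thinking Challenge
prices = [17, 16, 21, 18, 13, 16, 12, 11]

n = len(prices)
x_bar = sum(prices) / n  # sample mean

# Squared deviations about the sample mean
squared_deviations = [(x - x_bar) ** 2 for x in prices]

# Sample variance divides by n - 1, not n
sample_variance = sum(squared_deviations) / (n - 1)
sample_std = math.sqrt(sample_variance)

print(round(sample_variance, 2), round(sample_std, 2))  # 11.14 3.34
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same $n - 1$ denominator and should agree with these values.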

15 Summary of Variation Measures
Measure | Formula | Description
Range | $X_{\text{largest}} - X_{\text{smallest}}$ | Total spread
Standard Deviation (Sample) | $\sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ | Dispersion about sample mean
Standard Deviation (Population) | $\sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu_x)^2}{N}}$ | Dispersion about population mean
Variance (Sample) | $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ | Squared dispersion about sample mean

16 2.6 Interpreting the Standard Deviation

17 Interpreting Standard Deviation: Chebyshev's Theorem
Applies to any shape data set
No useful information about the fraction of data in the interval $\bar{x} - s$ to $\bar{x} + s$
At least 3/4 of the data lies in the interval $\bar{x} - 2s$ to $\bar{x} + 2s$
At least 8/9 of the data lies in the interval $\bar{x} - 3s$ to $\bar{x} + 3s$
In general, for $k > 1$, at least $1 - 1/k^2$ of the data lies in the interval $\bar{x} - ks$ to $\bar{x} + ks$

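18 Interpreting Standard Deviation: Chebyshev's Theorem [Figure: number line marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$; no useful information within $\bar{x} \pm s$, at least 3/4 of the data within $\bar{x} \pm 2s$, at least 8/9 of the data within $\bar{x} \pm 3s$]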

19 Chebyshev's Theorem Example Previously we found the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34. Use this information to form an interval that will contain at least 75% of the closing stock prices of new stock issues.

20 Chebyshev's Theorem Example At least 75% of the closing stock prices of new stock issues will lie within 2 standard deviations of the mean.
$\bar{x} = 15.5$, $s = 3.34$
$(\bar{x} - 2s, \bar{x} + 2s) = (15.5 - 2(3.34), 15.5 + 2(3.34)) = (8.82, 22.18)$
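The interval and the guaranteed fraction can be computed together. This is a small sketch under the assumption that the mean and standard deviation are already known; `chebyshev_interval` is a hypothetical helper name, not something from the slides.

```python
def chebyshev_interval(mean, std, k):
    """Return (lower, upper, min_fraction) for the interval mean +/- k*std.

    Chebyshev's theorem guarantees at least 1 - 1/k^2 of the data
    inside the interval, for any data shape, as long as k > 1.
    """
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    min_fraction = 1 - 1 / k ** 2
    return mean - k * std, mean + k * std, min_fraction


# Stock price example: mean 15.5, standard deviation 3.34, k = 2
lower, upper, fraction = chebyshev_interval(15.5, 3.34, 2)
print(round(lower, 2), round(upper, 2), fraction)  # 8.82 22.18 0.75
```

Calling it with `k = 3` gives the 8/9 bound from the theorem slide.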

21 Interpreting Standard Deviation: Empirical Rule
Applies to data sets that are mound shaped and symmetric
Approximately 68% of the measurements lie in the interval $\bar{x} - s$ to $\bar{x} + s$
Approximately 95% of the measurements lie in the interval $\bar{x} - 2s$ to $\bar{x} + 2s$
Approximately 99.7% of the measurements lie in the interval $\bar{x} - 3s$ to $\bar{x} + 3s$

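22 Interpreting Standard Deviation: Empirical Rule [Figure: mound-shaped curve marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$; approximately 68% of the measurements within $\bar{x} \pm s$, approximately 95% within $\bar{x} \pm 2s$, approximately 99.7% within $\bar{x} \pm 3s$]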

23 Empirical Rule Example Previously we found the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34. If we can assume the data is symmetric and mound shaped, calculate the percentage of the data that lie within the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.

24 Empirical Rule Example According to the Empirical Rule:
Approximately 68% of the data will lie in the interval $(\bar{x} - s, \bar{x} + s) = (15.5 - 3.34, 15.5 + 3.34) = (12.16, 18.84)$
Approximately 95% of the data will lie in the interval $(\bar{x} - 2s, \bar{x} + 2s) = (15.5 - 2(3.34), 15.5 + 2(3.34)) = (8.82, 22.18)$
Approximately 99.7% of the data will lie in the interval $(\bar{x} - 3s, \bar{x} + 3s) = (15.5 - 3(3.34), 15.5 + 3(3.34)) = (5.48, 25.52)$
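The three intervals can be reproduced, and compared with what the eight stock prices actually do, with a short sketch like the one below. The dictionary names are illustrative assumptions. With so few observations, the observed shares are only loosely comparable to the 68/95/99.7 figures.

```python
# Closing stock prices and the summary values quoted on the slides
prices = [17, 16, 21, 18, 13, 16, 12, 11]
mean, std = 15.5, 3.34

# Approximate shares stated by the Empirical Rule for mound-shaped data
rule_share = {1: 0.68, 2: 0.95, 3: 0.997}

intervals = {}
observed_share = {}
for k in rule_share:
    low, high = mean - k * std, mean + k * std
    intervals[k] = (round(low, 2), round(high, 2))
    # Fraction of the actual sample that falls inside the interval
    inside = [x for x in prices if low <= x <= high]
    observed_share[k] = len(inside) / len(prices)

for k in rule_share:
    print(k, intervals[k], rule_share[k], observed_share[k])
```

For this sample, 5 of the 8 prices fall within one standard deviation of the mean, and all 8 fall within two.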

25 2.7 Numerical Measures of Relative Standing

26 Numerical Measures of Relative Standing: Percentiles
Describes the relative location of a measurement compared to the rest of the data
The pth percentile is a number such that p% of the data falls below it and (100 - p)% falls above it
Median = 50th percentile

27 Percentile Example You scored 560 on the GMAT exam. This score puts you in the 58th percentile. What percentage of test takers scored lower than you did? What percentage of test takers scored higher than you did?

28 Percentile Example What percentage of test takers scored lower than you did? 58% of test takers scored lower than 560. What percentage of test takers scored higher than you did? (100 - 58)% = 42% of test takers scored higher than 560.

29 Numerical Measures of Relative Standing: z Scores
Describes the relative location of a measurement compared to the rest of the data
Measures the number of standard deviations away from the mean a data value is located
Sample z score: $z = \frac{x - \bar{x}}{s}$
Population z score: $z = \frac{x - \mu}{\sigma}$

30 Z Score Example The mean time to assemble a product is 22.5 minutes with a standard deviation of 2.5 minutes. Find the z score for an item that took 20 minutes to assemble. Find the z score for an item that took 27.5 minutes to assemble.

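31 Z Score Example
$x = 20$, $\mu = 22.5$, $\sigma = 2.5$: $z = \frac{x - \mu}{\sigma} = \frac{20 - 22.5}{2.5} = -1.0$
$x = 27.5$, $\mu = 22.5$, $\sigma = 2.5$: $z = \frac{x - \mu}{\sigma} = \frac{27.5 - 22.5}{2.5} = 2.0$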
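The two assembly-time calculations follow the same pattern, which a one-line function captures. This is a minimal sketch; `z_score` is a hypothetical helper name.

```python
def z_score(x, mean, std):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std


# Assembly-time example: mean 22.5 minutes, standard deviation 2.5 minutes
z_fast = z_score(20, 22.5, 2.5)    # item assembled faster than average
z_slow = z_score(27.5, 22.5, 2.5)  # item assembled slower than average

print(z_fast, z_slow)  # -1.0 2.0
```

The sign carries the direction: the 20-minute item sits one standard deviation below the mean, the 27.5-minute item two above it.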

32 Interpretation of z Scores for Mound-Shaped Distributions of Data
1. Approximately 68% of the measurements will have a z-score between -1 and 1.
2. Approximately 95% of the measurements will have a z-score between -2 and 2.
3. Approximately 99.7% of the measurements will have a z-score between -3 and 3.
(see the figure on the next slide)

33 Interpretation of z Scores

34 2.8 Methods for Detecting Outliers: Box Plots and z-scores

35 Outlier An observation (or measurement) that is unusually large or small relative to the other values in a data set is called an outlier. Outliers typically are attributable to one of the following causes: 1. The measurement is observed, recorded, or entered into the computer incorrectly. 2. The measurement comes from a different population. 3. The measurement is correct but represents a rare (chance) event.

36 Quartiles
Measure of noncentral tendency
Split ordered data into 4 quarters: 25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
Lower quartile $Q_L$ is the 25th percentile.
Middle quartile $m$ is the median.
Upper quartile $Q_U$ is the 75th percentile.
Interquartile range: IQR = $Q_U - Q_L$

37 Quartile (Q 2 ) Example Raw Data: Ordered: Position: Q 2 is the median, the average of the two middle scores ( )/2 = 8.8

38 Quartile (Q 1 ) Example Raw Data: Ordered: Position: Q L or Q 1 is median of bottom half = 6.3

39 Quartile (Q3) Example Raw Data: Ordered: Position: $Q_U$ or Q3 is the median of the top half = 10.3

40 Interquartile Range
1. Measure of dispersion
2. Also called midspread
3. Difference between third & first quartiles: Interquartile Range = Q3 - Q1
4. Spread in middle 50%
5. Not affected by extreme values

41 Thinking Challenge You're a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. What are the quartiles, Q1 and Q3, and the interquartile range?

42 Quartile Solution* Q1:
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Q1 is the median of the bottom half, the average of the two middle scores: (12 + 13)/2 = 12.5

43 Quartile Solution* Q3:
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Q3 is the median of the top half, the average of the two middle scores: (17 + 18)/2 = 17.5

44 Interquartile Range Solution* Interquartile Range:
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Interquartile Range = Q3 - Q1 = 17.5 - 12.5 = 5
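The median-of-halves procedure used in this solution can be sketched as follows. `quartiles_by_halves` is a hypothetical helper name, and dropping the median from both halves when n is odd is an assumed convention, since the slides only work through even-sized samples.

```python
import statistics


def quartiles_by_halves(data):
    """Return (Q1, median, Q3) using the median-of-halves method.

    Assumption: for odd n the overall median is excluded from both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )


# Closing stock prices from the Thinking Challenge
q1, q2, q3 = quartiles_by_halves([17, 16, 21, 18, 13, 16, 12, 11])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 12.5 16.0 17.5 5.0
```

Statistical packages often interpolate percentiles differently, so their quartiles may not match these hand-calculated values exactly.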

45 Box Plot 1. Graphical display of data using the 5-number summary: $X_{\text{smallest}}$, Q1, Median, Q3, $X_{\text{largest}}$

46 Box Plot
1. Draw a rectangle (box) with the ends (hinges) drawn at the lower and upper quartiles ($Q_L$ and $Q_U$). The median is shown by a line or symbol (such as +).
2. The points at distances 1.5(IQR) from each hinge define the inner fences of the data set. Lines (whiskers) are drawn from each hinge to the most extreme measurements inside the inner fences.

47 Box Plot 3. A second pair of fences, the outer fences, are defined at a distance of 3(IQR) from the hinges. One symbol (*) represents measurements falling between the inner and outer fences, and another (0) represents measurements beyond the outer fences. 4. Symbols that represent the median and extreme data points vary depending on software used. You may use your own symbols if you are constructing a box plot by hand.

48 Definition: Boxplot A boxplot is a graph of lines (from the lowest point inside the lower inner fence to the highest point inside the upper inner fence) and boxes (from the lower quartile to the upper quartile) indicating the position of the median. [Figure labels: outliers (*); lowest data point above the lower inner fence; lower quartile; median; upper quartile; highest data point below the upper inner fence]

49 Suspected Outliers and Highly Suspected Outliers
Suspected outliers lie more than 1.5 IQRs but less than 3 IQRs above the upper quartile, or more than 1.5 IQRs but less than 3 IQRs below the lower quartile.
Highly suspected outliers lie more than 3 IQRs above the upper quartile, or more than 3 IQRs below the lower quartile.

50 Example - Fax 28 data points: 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5, 6, 6, 6, 8, 10, 13, 25, 37
Min = 1, $Q_L$ = 2, M = 3, $Q_U$ = 6, Max = 37. IQR = 6 - 2 = 4.
Inner fence extremes: $Q_L - 1.5\,\text{IQR} = -4$, $Q_U + 1.5\,\text{IQR} = 12$
Outer fence extremes: $Q_L - 3\,\text{IQR} = -10$, $Q_U + 3\,\text{IQR} = 18$
Suspected outliers: 13
Highly suspected outliers: 25, 37
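The fence calculation for the fax data can be sketched in a few lines. The variable names are illustrative assumptions, as is the treatment of values that land exactly on a fence (here counted as inside it), which the slides do not specify.

```python
import statistics

# The 28 fax observations from the example
fax = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4,
       5, 5, 5, 5, 6, 6, 6, 8, 10, 13, 25, 37]

ordered = sorted(fax)
half = len(ordered) // 2  # n is even, so the data split cleanly into two halves
q_l = statistics.median(ordered[:half])   # lower quartile
q_u = statistics.median(ordered[half:])   # upper quartile
iqr = q_u - q_l

inner = (q_l - 1.5 * iqr, q_u + 1.5 * iqr)  # inner fences
outer = (q_l - 3 * iqr, q_u + 3 * iqr)      # outer fences

# Between the inner and outer fences: suspected outliers
suspected = [x for x in ordered
             if outer[0] <= x < inner[0] or inner[1] < x <= outer[1]]
# Beyond the outer fences: highly suspected outliers
highly_suspected = [x for x in ordered if x < outer[0] or x > outer[1]]

print(inner, outer, suspected, highly_suspected)
```

The lower fences are negative here, so no fax count can fall below them; every flagged value is on the high side.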

51 SPSS Use Analyze/Descriptive Statistics/Explore. Under Plots, make certain Box plots is selected.

52 Shape & Box Plot [Figure: three box plots, each marked Q1, Median, Q3, for left-skewed, symmetric, and right-skewed distributions]

53 Detecting Outliers Box Plots: Observations falling between the inner and outer fences are deemed suspect outliers. Observations falling beyond the outer fence are deemed highly suspect outliers. z-scores: Observations with z-scores greater than 3 in absolute value are considered outliers. (For some highly skewed data sets, observations with z-scores greater than 2 in absolute value may be outliers.)
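For comparison with the box plot approach, the z-score rule can be applied to the same fax data. This is a rough sketch using the sample mean and sample standard deviation from Python's `statistics` module; `z_outliers` is a hypothetical helper name.

```python
import statistics

# The 28 fax observations from the earlier example
fax = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4,
       5, 5, 5, 5, 6, 6, 6, 8, 10, 13, 25, 37]

x_bar = statistics.mean(fax)
s = statistics.stdev(fax)  # sample standard deviation (n - 1 denominator)


def z_outliers(data, mean, std, threshold):
    """Return the values whose z-score exceeds the threshold in absolute value."""
    return [x for x in data if abs((x - mean) / std) > threshold]


flagged_3 = z_outliers(fax, x_bar, s, 3)  # usual cutoff
flagged_2 = z_outliers(fax, x_bar, s, 2)  # looser cutoff for highly skewed data

print(flagged_3, flagged_2)
```

On these data the |z| > 3 rule flags only 37 and the |z| > 2 rule adds 25, while the box plot fences also flag 13. Extreme values inflate the mean and standard deviation that the z-scores depend on, which is one reason the two methods can disagree.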

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Unit 2: Numerical Descriptive Measures

Unit 2: Numerical Descriptive Measures Unit 2: Numerical Descriptive Measures Summation Notation Measures of Central Tendency Measures of Dispersion Chebyshev's Rule Empirical Rule Measures of Relative Standing Box Plots z scores Jan 28 10:48

More information

MgtOp 215 Chapter 3 Dr. Ahn

MgtOp 215 Chapter 3 Dr. Ahn MgtOp 215 Chapter 3 Dr. Ahn Measures of central tendency (center, location): measures the middle point of a distribution or data; these include mean and median. Measures of dispersion (variability, spread):

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations: Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

Section 3. Measures of Variation

Section 3. Measures of Variation Section 3 Measures of Variation Range Range = (maximum value) (minimum value) It is very sensitive to extreme values; therefore not as useful as other measures of variation. Sample Standard Deviation The

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Chapter 5. Understanding and Comparing. Distributions

Chapter 5. Understanding and Comparing. Distributions STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

Chapter 3. Measuring data

Chapter 3. Measuring data Chapter 3 Measuring data 1 Measuring data versus presenting data We present data to help us draw meaning from it But pictures of data are subjective They re also not susceptible to rigorous inference Measuring

More information

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami Unit Two Descriptive Biostatistics Dr Mahmoud Alhussami Descriptive Biostatistics The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics By A.V. Vedpuriswar October 2, 2016 Introduction The word Statistics is derived from the Italian word stato, which means state. Statista refers to a person involved with the

More information

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability 3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability 3.1 Week 1 Review Creativity is more than just being different. Anybody can plan weird; that s easy. What s hard is to be

More information

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data Finding Quartiles. Use the median to divide the ordered data set into two halves.. If n is odd, do not include the median in either half. If n is even, split this data set exactly in half.. Q1 is the median

More information

equal to the of the. Sample variance: Population variance: **The sample variance is an unbiased estimator of the

equal to the of the. Sample variance: Population variance: **The sample variance is an unbiased estimator of the DEFINITION The variance (aka dispersion aka spread) of a set of values is a measure of equal to the of the. Sample variance: s Population variance: **The sample variance is an unbiased estimator of the

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

= n 1. n 1. Measures of Variability. Sample Variance. Range. Sample Standard Deviation ( ) 2. Chapter 2 Slides. Maurice Geraghty

= n 1. n 1. Measures of Variability. Sample Variance. Range. Sample Standard Deviation ( ) 2. Chapter 2 Slides. Maurice Geraghty Chapter Slides Inferential Statistics and Probability a Holistic Approach Chapter Descriptive Statistics This Course Material by Maurice Geraghty is licensed under a Creative Commons Attribution-ShareAlike.

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

Range The range is the simplest of the three measures and is defined now.

Range The range is the simplest of the three measures and is defined now. Measures of Variation EXAMPLE A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test.

More information

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts. Slide 1 Slide 2 Daphne Phillip Kathy Slide 3 Pick a Brick 100 pts 200 pts 500 pts 300 pts 400 pts 200 pts 300 pts 500 pts 100 pts 300 pts 400 pts 100 pts 400 pts 100 pts 200 pts 500 pts 100 pts 400 pts

More information

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16 Stats Review Chapter Revised 8/16 Note: This review is composed of questions similar to those found in the chapter review and/or chapter test. This review is meant to highlight basic concepts from the

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

After completing this chapter, you should be able to:

After completing this chapter, you should be able to: Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

Review: Central Measures

Review: Central Measures Review: Central Measures Mean, Median and Mode When do we use mean or median? If there is (are) outliers, use Median If there is no outlier, use Mean. Example: For a data 1, 1.2, 1.5, 1.7, 1.8, 1.9, 2.3,

More information

Lecture 11. Data Description Estimation

Lecture 11. Data Description Estimation Lecture 11 Data Description Estimation Measures of Central Tendency (continued, see last lecture) Sample mean, population mean Sample mean for frequency distributions The median The mode The midrange 3-22

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

MATH 117 Statistical Methods for Management I Chapter Three

MATH 117 Statistical Methods for Management I Chapter Three Jubail University College MATH 117 Statistical Methods for Management I Chapter Three This chapter covers the following topics: I. Measures of Center Tendency. 1. Mean for Ungrouped Data (Raw Data) 2.

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Numerical Measures of Central Tendency

Numerical Measures of Central Tendency ҧ Numerical Measures of Central Tendency The central tendency of the set of measurements that is, the tendency of the data to cluster, or center, about certain numerical values; usually the Mean, Median

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population . Measures of Central Tendency: Mode, Median and Mean Average a single number that is used to describe the entire sample or population. Mode a. Easiest to compute, but not too stable i. Changing just one

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Determining the Spread of a Distribution Variance & Standard Deviation

Determining the Spread of a Distribution Variance & Standard Deviation Determining the Spread of a Distribution Variance & Standard Deviation 1.3 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3 Lecture 3 1 / 32 Outline 1 Describing

More information

CHAPTER 1. Introduction
