Descriptive Statistics Example


L. K. Gaafar

A manufacturer is investigating the operating life of laptop computer batteries. The following data are available.

Life (min.)   Life (min.)   Life (min.)   Life (min.)
130           145           126           146
164           130           132           152
145           129           133           155
140           127           139           137
131           126           145           148
125           132           126           126
126           135           131           129
147           136           129           136
156           146           130           146
132           142           132           132

Using the first two digits as the stem, we may develop the following plot:

Stem | Leaves                          | Freq.
 12  | 5 6 9 7 6 6 6 9 6 9             | 10
 13  | 0 1 2 0 2 5 6 2 3 9 1 0 2 7 6 2 | 16
 14  | 5 0 7 5 6 2 5 6 8 6             | 10
 15  | 6 2 5                           | 3
 16  | 4                               | 1

Stem-and-leaf plot

The plot shows that most of the data are clustered around 130, with few data points exceeding 150. One may conclude that the center of the data is somewhere in the 130s. Variation is harder to judge: at this stage, whether the variability is high or low can only be determined on a comparative basis. If another data set is available (say, for another brand), a back-to-back stem-and-leaf plot could be used to visually compare the variability in both sets. By ordering the leaves, we get the following plot:

Stem | Leaves                          | Freq.
 12  | 5 6 6 6 6 6 7 9 9 9             | 10
 13  | 0 0 0 1 1 2 2 2 2 2 3 5 6 6 7 9 | 16
 14  | 0 2 5 5 5 6 6 6 7 8             | 10
 15  | 2 5 6                           | 3
 16  | 4                               | 1

Ordered stem-and-leaf plot
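The grouping step behind a stem-and-leaf plot can be sketched in a few lines of Python. This is only an illustration and not part of the original example: the names `battery1` and `stem_and_leaf` are made up here, and the list holds the 40 Battery 1 observations.

```python
from collections import defaultdict

# Battery 1 lives in minutes (the 40 observations of this example).
battery1 = [
    130, 145, 126, 146, 164, 130, 132, 152, 145, 129,
    133, 155, 140, 127, 139, 137, 131, 126, 145, 148,
    125, 132, 126, 126, 126, 135, 131, 129, 147, 136,
    129, 136, 156, 146, 130, 146, 132, 142, 132, 132,
]


def stem_and_leaf(data, ordered=True):
    """Return (stem, leaves, frequency) rows; stem = first two digits, leaf = last digit."""
    groups = defaultdict(list)
    for value in data:
        stem, leaf = divmod(value, 10)  # e.g. 145 -> stem 14, leaf 5
        groups[stem].append(leaf)
    rows = []
    for stem in sorted(groups):
        leaves = sorted(groups[stem]) if ordered else list(groups[stem])
        rows.append((stem, leaves, len(leaves)))
    return rows


plot = stem_and_leaf(battery1)
for stem, leaves, freq in plot:
    leaf_text = " ".join(str(leaf) for leaf in leaves)
    print(f"{stem:>3} | {leaf_text:<32}| {freq}")
```

Calling `stem_and_leaf(battery1, ordered=False)` keeps the leaves in the order in which the values appear in the list, which corresponds to the unordered version of the plot.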

From the plot above, we may determine many measures of dispersion and central tendency:

Minimum = 125, Maximum = 164, Range = 164 - 125 = 39.
Mode = 126, 132 (both are repeated 5 times; bimodal data).
Median: $\tilde{x} = \dfrac{x_{[20]} + x_{[21]}}{2} = \dfrac{132 + 133}{2} = 132.5$.

Other measures require some calculations:

Average: $\bar{x} = \dfrac{1}{40}\sum_{i=1}^{40} x_i = \dfrac{130 + 164 + \dots + 146 + 132}{40} = 136.85$.

These results confirm our initial conclusion that the center is in the 130s.

Variance: $s^2 = \dfrac{\sum_{i=1}^{40} (x_i - \bar{x})^2}{39} = \dfrac{(130 - 136.85)^2 + \dots + (132 - 136.85)^2}{39} = 95.87$.
Standard deviation: $s = \sqrt{s^2} = 9.79$.

Note: The average, median, mode, variance, and standard deviation may all be determined using the Excel functions AVERAGE, MEDIAN, MODE, VAR, and STDEV, respectively.

Also, we may use the ordered stem-and-leaf plot (repeated below for convenience) to determine some probabilities:

Stem | Leaves                          | Freq.
 12  | 5 6 6 6 6 6 7 9 9 9             | 10
 13  | 0 0 0 1 1 2 2 2 2 2 3 5 6 6 7 9 | 16
 14  | 0 2 5 5 5 6 6 6 7 8             | 10
 15  | 2 5 6                           | 3
 16  | 4                               | 1

For example:

Only 3 observations are not less than 155. Therefore, P(X < 155) = 37/40 = 0.925, or 92.5%. This means that, based on the data we have, we expect 92.5% of the batteries to fail before 155 minutes.

3 observations are 155 or above, so P(X ≥ 155) = 3/40 = 0.075, or 7.5%. This is also the complement of the above probability (1 - 0.925).

12 observations are greater than or equal to 140 and less than or equal to 155. Therefore, P(140 ≤ X ≤ 155) = 12/40 = 0.30, or 30%. Based on the data we have, we expect 30% of the batteries to last no less than 140 minutes, but no more than 155 minutes.
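As a cross-check outside Excel, the same measures and relative-frequency probabilities can be reproduced with Python's standard `statistics` module. This is a minimal sketch rather than part of the original example; the variable names are illustrative, and `statistics.variance` and `statistics.stdev` use the sample (n - 1) divisor, matching the formula above.

```python
import statistics

# Battery 1 lives in minutes (the 40 observations of this example).
battery1 = [
    130, 145, 126, 146, 164, 130, 132, 152, 145, 129,
    133, 155, 140, 127, 139, 137, 131, 126, 145, 148,
    125, 132, 126, 126, 126, 135, 131, 129, 147, 136,
    129, 136, 156, 146, 130, 146, 132, 142, 132, 132,
]
n = len(battery1)

# Measures of central tendency and dispersion.
data_range = max(battery1) - min(battery1)
mean = statistics.mean(battery1)
median = statistics.median(battery1)
modes = statistics.multimode(battery1)    # every value tied for most frequent
variance = statistics.variance(battery1)  # sample variance, divisor n - 1
std_dev = statistics.stdev(battery1)      # square root of the sample variance

# Probabilities estimated as relative frequencies.
p_below_155 = sum(x < 155 for x in battery1) / n
p_at_least_155 = sum(x >= 155 for x in battery1) / n
p_140_to_155 = sum(140 <= x <= 155 for x in battery1) / n

print(f"range={data_range}, mean={mean:.2f}, median={median}, modes={modes}")
print(f"variance={variance:.2f}, std dev={std_dev:.2f}")
print(f"P(X<155)={p_below_155:.3f}, P(X>=155)={p_at_least_155:.3f}, "
      f"P(140<=X<=155)={p_140_to_155:.2f}")
```

Unlike a single-valued mode function, `statistics.multimode` returns both 126 and 132, which makes the bimodal nature of the data visible.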

Notice that all calculated probabilities are approximate estimates that will improve as the amount of data increases.

We may develop a frequency distribution table for the data by dividing its range into classes and counting the frequency of data in each class. The number of classes (c) should be between 5 and 20, but close to √n, where n is the number of data points (n = 40). In our case, √40 ≈ 6.3, so we should use about 6 or 7 classes. The class width (w) may be determined as w = Range/c. In our case, w = 39/7 = 5.571. To simplify calculations, we may increase c to 8 and modify w to 5. If we start the first class at 125, its upper bound would be 130, and all other classes are determined accordingly. Since the lowest data point is 125, the lower class limit must be inclusive. The last upper class limit is one point above the maximum, guaranteeing that all data will be included. The following table shows the frequency distribution of the data.

Class Interval   Frequency   Cumulative   Relative    Cumulative Relative
                             Frequency    Frequency   Frequency
125 ≤ X < 130    10          10           0.25        0.250
130 ≤ X < 135    11          21           0.275       0.525
135 ≤ X < 140     5          26           0.125       0.650
140 ≤ X < 145     2          28           0.05        0.700
145 ≤ X < 150     8          36           0.20        0.900
150 ≤ X < 155     1          37           0.025       0.925
155 ≤ X < 160     2          39           0.05        0.975
160 ≤ X < 165     1          40           0.025       1.000

The following histogram is a graphical depiction of the frequencies above. It shows that most of the data are clustered around 135, with few points above 150.

[Figure: histogram of the frequency distribution]
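The class-counting procedure described above can be sketched as follows. This is an illustrative Python version, not the author's own tool: the variable names are assumptions, and the class settings (start at 125, width 5, 8 classes, inclusive lower limits) are the ones chosen in the text.

```python
import math

# Battery 1 lives in minutes (the 40 observations of this example).
battery1 = [
    130, 145, 126, 146, 164, 130, 132, 152, 145, 129,
    133, 155, 140, 127, 139, 137, 131, 126, 145, 148,
    125, 132, 126, 126, 126, 135, 131, 129, 147, 136,
    129, 136, 156, 146, 130, 146, 132, 142, 132, 132,
]
n = len(battery1)

# Rule of thumb from the text: number of classes close to sqrt(n).
suggested_classes = math.sqrt(n)

# Settings chosen in the text to simplify the calculations.
start, width, num_classes = 125, 5, 8

table = []
cumulative = 0
for k in range(num_classes):
    lower = start + k * width
    upper = lower + width
    # Lower limit inclusive, upper limit exclusive: lower <= X < upper.
    frequency = sum(lower <= x < upper for x in battery1)
    cumulative += frequency
    table.append((lower, upper, frequency, cumulative, frequency / n, cumulative / n))

print(f"sqrt(n) = {suggested_classes:.2f}")
for lower, upper, freq, cum, rel, cum_rel in table:
    print(f"{lower} <= X < {upper}: {freq:>2} {cum:>2} {rel:.3f} {cum_rel:.3f}")
```

The loop makes the half-open intervals explicit, so a value such as 130 is counted once, in the class that starts at 130.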

A cumulative relative frequency plot may be used to calculate various probabilities. For example, in the plot below, we see that the probability of a battery life of less than 155 is 0.925.

[Figure: cumulative relative frequency plot]

If our frequency distribution were developed with inclusive upper bounds, we could obtain cumulative probabilities of the form P(X ≤ x) directly from the graph. To do that, we should start the first class from 124 to include all data. Consequently, the upper limit of the last class would be 164.
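To illustrate the inclusive-upper-bound variant, the sketch below evaluates the cumulative relative frequency P(X ≤ b) at the upper limits 129, 134, ..., 164 of classes starting from 124 with width 5. It is a hedged example only: the helper `cumulative_relative_frequency` is a made-up name, and the class width of 5 is carried over from the frequency distribution in this example.

```python
from bisect import bisect_right

# Battery 1 lives in minutes (the 40 observations of this example), sorted.
battery1 = sorted([
    130, 145, 126, 146, 164, 130, 132, 152, 145, 129,
    133, 155, 140, 127, 139, 137, 131, 126, 145, 148,
    125, 132, 126, 126, 126, 135, 131, 129, 147, 136,
    129, 136, 156, 146, 130, 146, 132, 142, 132, 132,
])


def cumulative_relative_frequency(sorted_data, bound):
    """Fraction of observations less than or equal to bound."""
    return bisect_right(sorted_data, bound) / len(sorted_data)


# Classes 124 < X <= 129, 129 < X <= 134, ..., 159 < X <= 164.
upper_bounds = [124 + 5 * k for k in range(1, 9)]
curve = [(b, cumulative_relative_frequency(battery1, b)) for b in upper_bounds]

for bound, probability in curve:
    print(f"P(X <= {bound}) = {probability:.3f}")
```

Because the lives are recorded as whole minutes, P(X ≤ 154) equals P(X < 155), so the values read at these bounds agree with the cumulative relative frequencies of the half-open classes.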

Now, let us assume that another data set of 40 points is available for another brand of batteries (Battery 2).

Life (min.), listed in ascending order:
122   128   130   134   134   134   134   135   136   136
138   138   139   140   140   140   141   141   142   142
142   143   143   144   144   145   146   146   146   146
147   147   148   150   151   151   151   151   152   160

The measures of center and dispersion for Battery 2 are:

Minimum = 122, Maximum = 160, Range = 160 - 122 = 38.
Mode = 134, 146, 151 (all repeated 4 times; multimodal data).
Median: $\tilde{x} = \dfrac{x_{[20]} + x_{[21]}}{2} = \dfrac{142 + 142}{2} = 142$.
Average: $\bar{x} = \dfrac{1}{40}\sum_{i=1}^{40} x_i \approx 142$.

The data are symmetric (Average ≈ Median).

Variance: $s^2 = \dfrac{\sum_{i=1}^{40} (x_i - \bar{x})^2}{39} = 55.2$.
Standard deviation: $s = \sqrt{s^2} = 7.43$.

These results show numerically that Battery 2 has a higher average life with slightly less variation. An easy way to graphically compare the two sets is to develop a back-to-back stem-and-leaf plot.

Freq              Battery 2   Stem   Battery 1          Freq
   2                     82    12    5666667999          10
  11            98866544440    13    0001122223356679    16
  20   87766665443322211000    14    0255566678          10
   6                 211110    15    256                  3
   1                      0    16    4                    1

Back-to-back stem-and-leaf plot

The plot above shows that more data for Battery 2 are in the 140s compared to the 130s for Battery 1. Also, the spread (variability) of Battery 2 is less than that of Battery 1. Based on these results, we may conclude that Battery 2 is a better brand (higher average and lower variability). The validity of this conclusion, however, depends on how the data are collected and on the sufficiency of n. These issues are typically discussed as part of Inferential Statistics and Design of Experiments.

A better graphical comparison tool is the box (box-and-whisker) plot. A plot for both data sets is shown below.

[Figure: box plot of Battery 1 and Battery 2]

The plot above supports our previous conclusion, as the interquartile range of Battery 2 is shorter than that of Battery 1 (less variability) and is shifted to the right (higher center).
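The quantities a box plot draws can be computed directly as a five-number summary. The sketch below is illustrative only: `five_number_summary` is a made-up helper, and it assumes the "median of each half" convention for quartiles, which the original example does not specify; other conventions give slightly different quartile values.

```python
import statistics

# Battery lives in minutes (the 40 observations per brand in this example).
battery1 = [
    130, 145, 126, 146, 164, 130, 132, 152, 145, 129,
    133, 155, 140, 127, 139, 137, 131, 126, 145, 148,
    125, 132, 126, 126, 126, 135, 131, 129, 147, 136,
    129, 136, 156, 146, 130, 146, 132, 142, 132, 132,
]
battery2 = [
    122, 128, 130, 134, 134, 134, 134, 135, 136, 136,
    138, 138, 139, 140, 140, 140, 141, 141, 142, 142,
    142, 143, 143, 144, 144, 145, 146, 146, 146, 146,
    147, 147, 148, 150, 151, 151, 151, 151, 152, 160,
]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); quartiles are medians of the two halves."""
    values = sorted(data)
    half = len(values) // 2
    lower_half, upper_half = values[:half], values[-half:]
    return (
        values[0],
        statistics.median(lower_half),
        statistics.median(values),
        statistics.median(upper_half),
        values[-1],
    )


summary1 = five_number_summary(battery1)
summary2 = five_number_summary(battery2)
iqr1 = summary1[3] - summary1[1]
iqr2 = summary2[3] - summary2[1]

print("Battery 1:", summary1, "IQR =", iqr1)
print("Battery 2:", summary2, "IQR =", iqr2)
```

Under this convention the interquartile range of Battery 2 comes out smaller than that of Battery 1 and its quartiles sit higher, in line with the reading of the box plot.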