F78SC2 Notes 2 RJRC


Algebra

It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and substitute in the numbers only at the end.

Example: 100 invested for 2 years, with x representing the interest rate.
After one year, the value will be $100(1 + x)$.
After 2 years, the value will be $100(1 + x)^2$.
If the interest rate is 5%, we substitute $x = 0.05$ in the formula. This gives $100(1.05)^2 = 110.25$.

Example: 100 invested. What interest rate would give a value of 115 after 2 years?
Require $100(1 + x)^2 = 115$
So $(1 + x)^2 = 1.15$
So $1 + x = \sqrt{1.15} = 1.0724$
So $x = 0.0724 = 7.24\%$

Revision: Rules of Arithmetic

Order of calculations:
1. Evaluate expressions within brackets.
2. Evaluate functions (e.g. square root, power, log, exp).
3. Evaluate multiplications and divisions.
4. Evaluate additions and subtractions.
Note that multiplying two negative numbers gives a positive answer.

Examples (using the positive value of the square root):
$(1 + 3)\sqrt{9} = 4 \times 3 = 12$
$(-1)^2 + (-3)^2 = 1 + 9 = 10$
$(1 + 3)^2 = 16$

Minus Signs

Note: $-1^2 = -1$ is not the same as $(-1)^2 = +1$.
What is meant by expressions such as $-1 - 2$?
Answer: Take the negative number $-1$ and then subtract 2 from it.
Note that the minus sign is used in two different ways:

As a unary operator, to change 1 to $-1$.
As a binary operator, to subtract 2 from $-1$.

The above shows how to interpret expressions like $1 - -2$. The second minus sign is a unary minus, so $-2$ is subtracted from 1, giving the answer 3.
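The compound-interest examples and the minus-sign rules above can be checked numerically. The following is a minimal Python sketch; the function names `value_after` and `rate_for_target` are illustrative assumptions, not part of the notes. Python follows the same convention for the unary minus, applying the power first.

```python
PRINCIPAL = 100.0  # amount invested, as in the examples


def value_after(years, rate, start=PRINCIPAL):
    """Value of `start` after `years` of compound growth at `rate`."""
    return start * (1 + rate) ** years


def rate_for_target(target, years, start=PRINCIPAL):
    """Interest rate x that solves start * (1 + x)**years = target."""
    return (target / start) ** (1 / years) - 1


value_2y = value_after(2, 0.05)      # 100(1 + 0.05)^2
rate_115 = rate_for_target(115, 2)   # solves 100(1 + x)^2 = 115

# Unary versus binary minus: the power is evaluated before the unary minus.
neg_square = -1 ** 2        # -(1^2) = -1
square_of_neg = (-1) ** 2   # (-1)^2 = +1
double_minus = 1 - -2       # subtracting -2 from 1 gives 3

print(round(value_2y, 2), round(rate_115 * 100, 2))
```

Writing the rate as a function of the target keeps the algebra symbolic until the final substitution, which is the point made at the start of this section.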

Histograms

Suitable for continuous data. Data values are grouped into bins and the numbers in each bin are counted. Bins need not all be of the same width.
Examples: Student heights, House prices.

Construction:
1. Choose the bin width so that there are between 6 and 20 bins. Use more bins for larger sample sizes. The first bin should contain the minimum value and the last should contain the maximum.
2. Construct the frequency table.
3. Draw the histogram. The area of each rectangle should be proportional to the number in the bin.
Note: Minitab does not allow unequal bin widths.

Drawing Histograms

Rule for unequal bin widths:
$\text{Area} \propto \text{Frequency}$
$\text{Height} \times \text{Width} \propto \text{Frequency}$
$\text{Height} \propto \dfrac{\text{Frequency}}{\text{Width}}$

Example: The Edinburgh house price data is a case where different bin widths should be used.

Band    Width (W)    Freq. (F)    F/W    200F/W
[table rows not recoverable]

The height of each rectangle must be proportional to the value in the penultimate column of the table (F/W). When plotting by hand, it may be more convenient to multiply these values by a suitable constant, such as 200.

Figure 1: Histogram of student heights, equal bin sizes
Figure 2: Histogram of house prices, equal bin sizes
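The rule for unequal bin widths (height proportional to frequency divided by width) can be sketched in a few lines of Python. The bands and frequencies below are hypothetical, chosen only for illustration; they are not the Edinburgh house price data, and the helper name `bar_heights` is an assumption.

```python
# Hypothetical bins as (lower edge, upper edge, frequency).
bins = [(0, 50, 10), (50, 100, 30), (100, 200, 40), (200, 400, 20)]


def bar_heights(bins, scale=1.0):
    """Return scale * frequency / width for each bin."""
    return [scale * freq / (upper - lower) for lower, upper, freq in bins]


# Multiplying by a constant (here 200) gives convenient plotting heights.
heights = bar_heights(bins, scale=200)

# Check: each rectangle's area is proportional to its frequency.
areas = [h * (upper - lower) for h, (lower, upper, _) in zip(heights, bins)]

for (lower, upper, freq), h, area in zip(bins, heights, areas):
    print(f"{lower:>4}-{upper:<4} freq={freq:<3} height={h:<6} area={area}")
```

With equal bin widths the division by width changes nothing but the vertical scale, which is why plotting raw frequencies is safe only in that case.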

Figure 3: Histogram of house prices, unequal bin sizes (density plotted against price)

Stem-and-Leaf Plots

Similar to a histogram, but contains much more information. Bin widths are taken as a power of 10 times either 2, 5 or 10. Suitable for continuous or discrete data.

Example: Heights (in cm) of a random sample of 60 students [data values not recoverable].
Find the minimum and maximum (155, 191).
Decide on the bin size: 2s will give 19 bins and 5s will give 8, so use the latter.
Write the stem: 15, 16, 16, 17, 17, 18, 18, 19.

Go through the data set, writing the next digit (the "leaf") against the appropriate stem value. Where a stem value is repeated, leaves 0 to 4 go against the first and 5 to 9 against the second.
Rewrite the table with the leaf values in ascending order.
Optional extra: add cumulative frequencies to the plot.

Note that the numbers in a stem-and-leaf plot may be truncated. Thus any summary statistics obtained by using them are sometimes rather too small.

Minitab output:
Stem-and-leaf of heights, N = 60
[display not recoverable; the central row is marked (20)]

Notes: The first column of the display gives cumulative frequencies; these are calculated from below and from above. The (20) indicates that there are 20 students in the category where these meet.
The reason that there are only a limited number of different leaf values is that the original values were recorded in inches and then converted to centimetres. The following is a plot of the same data in inches; here each bin on the stem has two leaf values associated with it.

Stem-and-leaf of C2, N = 60
[display not recoverable; the central row is marked (19)]

Note that changes in the starting value and in the bin width mean that the outline has a slightly different shape from the previous plot.

Graphical methods are often a useful first step in looking at sample data and are also useful for presenting conclusions to others. If we wish to draw conclusions about the data, we usually need to use numerical methods.

Averages (Rees)

There are many summary statistics that can be calculated for quantitative variables. Those that give the location are the most important. There are several ways of describing where the centre of the data is: the sample mean and the median are the commonest.

Sample Mean

The following symbols will be used to describe the calculations:
Sample size: $n$ (number of values in sample)
Data values: $x_1, x_2, \ldots, x_n$
Sample mean: $\bar{x}$

$\bar{x} = \dfrac{\text{sum of data values}}{\text{sample size}} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$

The suffix notation is used to make it clear which values are being added. This can often be omitted:
$\bar{x} = \dfrac{\sum x}{n}$

Examples:
Data: 1, 2, 3    Total = 6    Mean = 2.0
Data: 4, 6    Total = 10    Mean = 5.0
Data: 1, 2, 3, 4, 6    Total = 16    Mean = 3.2
Note that the overall mean is not the average of the two separate means.

Example: Failure times. Fourteen electrical components were tested to destruction. Failure times in hours (listed here in sorted order) were:
4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283

$n = 14$
$\sum x_i = 1138$
$\bar{x} = \dfrac{1138}{14} = 81.3$
Mean failure time = 81.3 hours

To calculate an overall mean from group means, it is important to allow for any differences in sample sizes. One way to do this is by calculating the individual totals. These are then added to get the overall total and hence the overall mean.

Example:
Mean age of 50 males = 22.6 years
Mean age of 30 females = 19.4 years
Mean age of combined group of 80 people?
Male total = $50 \times 22.6 = 1130$
Female total = $30 \times 19.4 = 582$
Mean age = $\dfrac{1130 + 582}{80} = \dfrac{1712}{80} = 21.4$ years

Example (the numerical values are missing from this copy):
Mean salary of 50 people = [value lost]
Mean salary of 30 males = [value lost]
Overall total = 50 × (overall mean)
Male total = 30 × (male mean)
Female total = Overall total - Male total
Female mean = Female total / 20

Sometimes data is available as a frequency table.

Example: Articles produced in a manufacturing process were examined by taking regular samples of 20 articles. The number of defective articles in each sample was noted. The following data were obtained:

Number of defectives    Frequency
[table rows not recoverable]

Note that the 11 in the frequency table means that the value 0 occurred 11 times. So:
$n = \sum f_i = 40$
$\sum x_i = \sum f_i x_i = 65$
$\bar{x} = \dfrac{65}{40} = 1.625$

The general rule for frequency data is:
$\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$
where $f_i$ is the frequency of outcome $x_i$.

Averages: Median

The median is the middle value when the data have been sorted. The median is used when the data set contains extreme values which would distort the mean. For example, in data set B (failures), $\bar{x} = 81.3$. This value is rather larger than what might be considered a typical value from this data set.

The data must be sorted before the median can be calculated. If we have $n$ data points then:
if $n$ is odd, the median is the $\left(\frac{n+1}{2}\right)$th sorted value;
if $n$ is even, it is the average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th sorted values.

Example: Failure times
Sorted data: 4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283
$n = 14$, so we require the average of the 7th and 8th sorted values.
So median = $\dfrac{55 + 75}{2} = 65$ hours.

The Median, Robustness and Skewness

The mean is the best measure of location if the shape of the plotted data is roughly symmetric. The median is a more robust measure of what is typical than the mean, i.e. the

median is not really affected by a few extreme values.

Example: Consider the two artificial data sets {1, 2, 3, 4, 5} and {1, 2, 3, 4, 90}. Both sets have a median of 3, but the mean of the first is 3 and the mean of the second is 20.

Example: Failure data. The extreme value of 283 distorted the mean. This is an example of data that are skew. A plot shows that the data are not symmetric, but have a tail on the right.

Other measures of location are sometimes used. Minitab uses the trimmed mean: the largest and smallest 5% of values are removed and the mean of the remaining 90% is calculated. For frequency tables, the mode is sometimes used; this is the most frequent category (or group).
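The calculations of the mean, the combined mean from group totals, the frequency-table rule and the median can be sketched with Python's standard `statistics` module. The helper names `combined_mean` and `frequency_mean` and the small `toy_table` are assumptions made for illustration; the failure times, the ages and the two artificial data sets are the ones used in the examples above.

```python
from statistics import mean, median

failure_times = [4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283]


def combined_mean(groups):
    """Overall mean from (size, mean) pairs, computed via the group totals."""
    total = sum(n * m for n, m in groups)
    size = sum(n for n, _ in groups)
    return total / size


def frequency_mean(table):
    """Mean of (value, frequency) pairs: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in table) / sum(f for _, f in table)


mean_failure = mean(failure_times)       # 1138 / 14
median_failure = median(failure_times)   # average of 7th and 8th sorted values
mean_age = combined_mean([(50, 22.6), (30, 19.4)])

# Hypothetical frequency table: value 0 twice, 1 once, 3 once.
toy_table = [(0, 2), (1, 1), (3, 1)]
toy_mean = frequency_mean(toy_table)

# Robustness: one extreme value moves the mean but not the median.
a, b = [1, 2, 3, 4, 5], [1, 2, 3, 4, 90]
print(mean(a), median(a), mean(b), median(b))
print(round(mean_failure, 1), median_failure, round(mean_age, 1))
```

Working from totals rather than from the group means is what keeps the unequal group sizes (50 and 30) from biasing the combined result.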

Measures of Spread (Rees)

It is important to compute a measure of spread as well as an average, in other words a measure of whether the data values are spread out or bunched together. The simplest measure is:
Range = Maximum - Minimum
This is not very satisfactory because it tends to increase as the sample size increases.

Another possible measure of spread is to take the (positive) distance $d_i$ of each point $x_i$ from the mean, i.e. $d_i = |x_i - \bar{x}|$. The mean of these distances could be used, but this turns out to be awkward to work with, so a different measure is used.

Standard deviation (Rees 4.8)

1. Find the squared distance between $x_i$ and $\bar{x}$: $d_i^2 = (x_i - \bar{x})^2$
2. Add these up and divide the total by $n - 1$ (instead of $n$) to find a mean.
3. Undo the squaring:
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

Notes:
1. Alternative formula (easier to compute):
$s = \sqrt{\dfrac{1}{n-1}\left(\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}\right)}$
2. The divisor $n - 1$ is used because estimating $\bar{x}$ uses up one of the pieces of information that we have. On the very rare occasions that the population mean is known exactly, the divisor $n$ is used. This is called the population standard deviation to distinguish it from the sample standard deviation, which is the usual form.
3. Many calculators provide a quick way to find a sample mean and standard deviation. There will often be a button labelled $\sigma_{n-1}$.
4. The standard deviation is in the same units as the data. E.g. if $x_i$ is measured in cm, then $s$ is measured in cm.
5. The larger the value of $s$, the more the data is spread out about the mean. It is analogous to the moment of inertia in physics.
6. The square of the standard deviation is called the Variance.
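The three-step definition and the alternative formula in note 1 should give the same answer. The sketch below checks this on the failure-time data and compares both with `statistics.stdev`, which also uses the divisor $n - 1$. The function names `sd_definition` and `sd_shortcut` are assumptions for illustration.

```python
import math
from statistics import stdev

failure_times = [4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283]


def sd_definition(xs):
    """Sample s.d. from squared distances to the mean, divisor n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))


def sd_shortcut(xs):
    """Sample s.d. from the sum and the sum of squares only."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return math.sqrt((total_sq - total ** 2 / n) / (n - 1))


s1 = sd_definition(failure_times)
s2 = sd_shortcut(failure_times)
s3 = stdev(failure_times)
print(round(s1, 1), round(s2, 1), round(s3, 1))
```

The shortcut needs only two running totals, which is why it suits hand calculation; in floating-point arithmetic it can lose accuracy when the mean is large relative to the spread, so the definitional form is the safer default in code.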

Example: 3 data values 2, 3, 7
$n = 3$
$\bar{x} = \dfrac{12}{3} = 4$
$(x_i - \bar{x}) = -2, -1, 3$
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 4 + 1 + 9 = 14$
Sample variance = $\dfrac{14}{3 - 1} = 7$

Alternatively:
$\sum_{i=1}^{n} x_i = 2 + 3 + 7 = 12$
$\sum_{i=1}^{n} x_i^2 = 4 + 9 + 49 = 62$
Variance = $\dfrac{1}{3-1}\left(62 - \dfrac{12^2}{3}\right) = \dfrac{1}{2}(62 - 48) = 7$
So sample Standard Deviation = $\sqrt{7} = 2.65$

The Greek letter σ (sigma) is often used to denote standard deviation. Calculators often have keys labelled $\sigma_n$ and $\sigma_{n-1}$. The latter is the one to use for the sample S.D.

Example: Failure times
$n = 14$, $\sum x_i = 1138$, $\sum x_i^2 = 174092$
$s = \sqrt{\dfrac{1}{13}\left(174092 - \dfrac{1138^2}{14}\right)} = \sqrt{\dfrac{81588.9}{13}} = 79.2$ hours

More Standard Deviation Examples: In each case, the mean is 5.0
Sample data: 5, 5, 5    Standard Deviation = 0.0
Sample data: 4, 5, 6    Standard Deviation = 1.0
Sample data: 1, 5, 9    Standard Deviation = 4.0
Sample data: 1, 1, 5, 5, 9, 9    Standard Deviation = 3.58

Sample data: 1, 2, 3, 4, 5, 6, 7, 8, 9    Standard Deviation = 2.74

An earlier example showed that when groups are combined, the overall mean is obtained by adding totals rather than averaging means. To obtain an overall standard deviation, it is necessary to add up the sums of squared values.

Example:
Data: 1, 2, 3    Total = 6    SS = 14    S.D. = 1.00
Data: 4, 6    Total = 10    SS = 52    S.D. = 1.41
Data: 1, 2, 3, 4, 6    Total = 16    SS = 66    S.D. = 1.92
The overall s.d. = $\sqrt{(66 - 16^2/5)/4} = 1.92$

Note that the abbreviations SS and S.D. are often used for the Sum of Squares and for the Standard Deviation respectively.

It is possible to calculate the total and the total sum of squares from the group sample size, mean and standard deviation. This is done by reversing the calculations above. For example: $\sum x = n\bar{x}$. Beware of rounding errors if you do this.

Standard deviation from a frequency table

It is possible to enter all the data directly, but it is usually best to use the table by adding extra columns.

Example: Number of defectives

Number (x)    Freq. (f)    fx    fx²
[table rows and totals not recoverable]

Thus: $n = 40$, $\sum x = 65$, $\sum x^2 = 197$

So:
$\bar{x} = \dfrac{65}{40} = 1.625$
s.d. = $\sqrt{\dfrac{1}{40 - 1}\left(197 - \dfrac{65^2}{40}\right)} = 1.53$

Accuracy

It is rarely useful to report the value of a standard deviation to more than 3 significant figures. So: report [value lost] as 1.23 or possibly [value lost], and [value lost] as 98.8.

A mean should usually be reported to the same number of decimal places as the corresponding standard deviation, or to one less decimal place. So for the two standard deviations above: a mean of [value lost] might be given as 5.43 and a mean of [value lost] might be given as 543.

Notes
1. It is important not to round off numbers too early, especially when finding standard deviations.
2. If reported values are used in other calculations, the accurate values should be used, rather than the rounded values.

Rounding

If a number is exactly 2.345, there is no universally agreed way to round to 2 d.p. A good rule is to round to the nearest even number. So 2.345 is rounded to 2.34.

Inter-quartile range (Rees 4.9)

If the data is skew, the inter-quartile range may be a better measure of spread than the standard deviation.

The lower quartile Q1 is the value such that a quarter of the sample takes values less than Q1. How do we calculate it? If we have $n$ data points arranged in ascending order, then Q1 is the $\left(\frac{n+1}{4}\right)$th observation.

The upper quartile Q3 is the value such that a quarter of the sample takes values greater than Q3. How do we calculate it? If we have $n$ data points arranged in ascending order, then Q3 is the $\left(\frac{3(n+1)}{4}\right)$th observation. Equivalently, it is the $\left(\frac{n+1}{4}\right)$th observation when counting down

from the largest value.

The inter-quartile range (IQR) is defined to be IQR = Q3 - Q1.

Note: This is the method of calculation used by Minitab. Some books use slightly different ways of estimating the quartiles. You may also encounter Deciles and Percentiles; these divide data into tenths and hundredths.

Example: Failure times, $n = 14$
So Q1 is the $\frac{n+1}{4} = \frac{14+1}{4} = 3.75$th observation.
Q3 is the $\frac{3(n+1)}{4} = \frac{3(14+1)}{4} = 11.25$th observation.
Data: 4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283
3rd observation = 10 and 4th observation = 28
So: Q1 = $10 + (0.75)(28 - 10) = 10 + 13.5 = 23.5$ hours
11th observation = 102 and 12th observation = 139
So: Q3 = $102 + (0.25)(139 - 102) = 102 + 9.25 = 111.25$ hours
The inter-quartile range IQR = $111.25 - 23.5 = 87.75$ hours.

Note: A stem-and-leaf plot presents data in a sorted form, so it can be used to find the median and quartiles. However, the resulting values may be rather too small, because the plot sometimes truncates numbers.

Box Plot

The quartiles can be used to create a display of the data called a box-and-whisker plot or box plot. The box is formed from the quartiles and the whiskers connect the box to the maximum and the minimum.
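The quartile rule used in the failure-times example (take the $(n+1)/4$th and $3(n+1)/4$th sorted values, interpolating between neighbours when the position is fractional) can be written out directly. This is a minimal sketch: the function names are assumptions, and it assumes at least three data values so that both positions fall inside the data.

```python
failure_times = [4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283]


def value_at_position(sorted_xs, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    lower = int(position)          # whole part: rank of the lower neighbour
    fraction = position - lower    # fractional part used for interpolation
    if fraction == 0:
        return float(sorted_xs[lower - 1])
    below, above = sorted_xs[lower - 1], sorted_xs[lower]
    return below + fraction * (above - below)


def quartiles(xs):
    """Return (Q1, Q3) using positions (n+1)/4 and 3(n+1)/4."""
    data = sorted(xs)
    n = len(data)
    q1 = value_at_position(data, (n + 1) / 4)
    q3 = value_at_position(data, 3 * (n + 1) / 4)
    return q1, q3


q1, q3 = quartiles(failure_times)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other software may use a different interpolation rule, so small differences from these values are expected when comparing packages.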

[Box plot diagram, labelled from left to right: Min, LQ, Median, UQ, Max]

If the data are skew, the median will not be near the middle of the box, and one whisker will be much longer than the other.

The values used in drawing a boxplot are called a five number summary.

Example: Failure times
The five number summary is {4, 23.5, 65, 111.25, 283}
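A five number summary can also be obtained from Python's standard library. In this sketch, `statistics.quantiles` with `method="exclusive"` places the quartiles at positions based on $n + 1$, which agrees with the hand calculation for the failure times; the function name `five_number_summary` is an assumption for illustration.

```python
from statistics import quantiles

failure_times = [4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283]


def five_number_summary(xs):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, med, q3 = quantiles(xs, n=4, method="exclusive")
    return (min(xs), q1, med, q3, max(xs))


summary = five_number_summary(failure_times)
low, q1, med, q3, high = summary

# Whisker lengths: a much longer upper whisker suggests a tail on the right.
lower_whisker = q1 - low
upper_whisker = high - q3
print(summary, lower_whisker, upper_whisker)
```

Here the upper whisker is far longer than the lower one and the median sits below the middle of the box, which matches the description of skew data above.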

Properties of Mean and Standard Deviation

If data are roughly symmetrical about the mean, then:
Approximately 2/3 will be within 1 s.d. of the mean.
Approximately 95% will be within 2 s.d. of the mean.
Usually all will be within 3 s.d. of the mean.
The Inter-Quartile Range will be approximately 1.35 standard deviations.

Standard Scores

An individual data point could be considered to be extreme if it is several standard deviations away from the mean. The standard score (z-score, standardised value) for $x$ is:
$z = \dfrac{x - \bar{x}}{s}$
This measures how many standard deviations $x$ is above or below the mean.

Example: Failure times
Recall that $\bar{x} = 81.3$ and $s = 79.2$.
$\dfrac{55 - 81.3}{79.2} = -0.33$: small z (typical)
$\dfrac{283 - 81.3}{79.2} = 2.55$: large z (extreme)
This suggests that 283 might be unusual.

Change of Scale

If a constant is added to all of the data values, the Mean is increased by the same constant; the S.D. is unchanged.
If all the data values are multiplied by a constant, the Mean and S.D. are both multiplied by the same constant.

Example: Temperature conversion from °C to °F. Need to multiply by 1.8 and add 32.
Celsius: Mean = 15 °C and S.D. = 5.5 °C
Fahrenheit Mean = $1.8 \times 15 + 32 = 59$ °F
Fahrenheit S.D. = $1.8 \times 5.5 = 9.9$ °F
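Standard scores and the change-of-scale rules can be checked with a short Python sketch. The failure times are the data used throughout; the Celsius readings are hypothetical values chosen only so that their mean is 15, and the function name `z_scores` is an assumption.

```python
from statistics import mean, stdev

failure_times = [4, 5, 10, 28, 37, 45, 55, 75, 76, 82, 102, 139, 197, 283]


def z_scores(xs):
    """Standard score (x - mean) / s for every value, with s the sample s.d."""
    xbar, s = mean(xs), stdev(xs)
    return [(x - xbar) / s for x in xs]


z = z_scores(failure_times)
z_typical = z[failure_times.index(55)]    # a value near the centre
z_extreme = z[failure_times.index(283)]   # the largest value

# Change of scale: multiply by 1.8 and add 32 (Celsius to Fahrenheit).
celsius = [9.5, 12.0, 15.0, 18.0, 20.5]   # hypothetical readings
fahrenheit = [1.8 * c + 32 for c in celsius]

mean_c, sd_c = mean(celsius), stdev(celsius)
mean_f, sd_f = mean(fahrenheit), stdev(fahrenheit)
print(round(z_typical, 2), round(z_extreme, 2))
print(round(mean_f, 1), round(1.8 * mean_c + 32, 1))
print(round(sd_f, 3), round(1.8 * sd_c, 3))
```

The added constant 32 shifts the mean but cancels out of every deviation from the mean, which is why only the factor 1.8 appears in the standard deviation.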


More information

Quartiles, Deciles, and Percentiles

Quartiles, Deciles, and Percentiles Quartiles, Deciles, and Percentiles From the definition of median that it s the middle point in the axis frequency distribution curve, and it is divided the area under the curve for two areas have the

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population . Measures of Central Tendency: Mode, Median and Mean Average a single number that is used to describe the entire sample or population. Mode a. Easiest to compute, but not too stable i. Changing just one

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

MgtOp 215 Chapter 3 Dr. Ahn

MgtOp 215 Chapter 3 Dr. Ahn MgtOp 215 Chapter 3 Dr. Ahn Measures of central tendency (center, location): measures the middle point of a distribution or data; these include mean and median. Measures of dispersion (variability, spread):

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 540W - Introduction to Biostatistics Fall 2004

UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 540W - Introduction to Biostatistics Fall 2004 UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 50W - Introduction to Biostatistics Fall 00 Exercises with Solutions Topic Summarizing Data Due: Monday September 7, 00 READINGS.

More information

Whitby Community College Your account expires on: 8 Nov, 2015

Whitby Community College Your account expires on: 8 Nov, 2015 To print higher resolution math symbols, click the Hi Res Fonts for Printing button on the jsmath control panel. If the math symbols print as black boxes, turn off image alpha channels using the Options

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Fifth Grade Mathematics Mathematics Course Outline

Fifth Grade Mathematics Mathematics Course Outline Crossings Christian School Academic Guide Middle School Division Grades 5-8 Fifth Grade Mathematics Place Value, Adding, Subtracting, Multiplying, and Dividing s will read and write whole numbers and decimals.

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

Math Literacy. Curriculum (457 topics)

Math Literacy. Curriculum (457 topics) Math Literacy This course covers the topics shown below. Students navigate learning paths based on their level of readiness. Institutional users may customize the scope and sequence to meet curricular

More information

Lecture 1 : Basic Statistical Measures

Lecture 1 : Basic Statistical Measures Lecture 1 : Basic Statistical Measures Jonathan Marchini October 11, 2004 In this lecture we will learn about different types of data encountered in practice different ways of plotting data to explore

More information

MATH 117 Statistical Methods for Management I Chapter Three

MATH 117 Statistical Methods for Management I Chapter Three Jubail University College MATH 117 Statistical Methods for Management I Chapter Three This chapter covers the following topics: I. Measures of Center Tendency. 1. Mean for Ungrouped Data (Raw Data) 2.

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning

More information

Section 3.2 Measures of Central Tendency

Section 3.2 Measures of Central Tendency Section 3.2 Measures of Central Tendency 1 of 149 Section 3.2 Objectives Determine the mean, median, and mode of a population and of a sample Determine the weighted mean of a data set and the mean of a

More information

Lecture 11. Data Description Estimation

Lecture 11. Data Description Estimation Lecture 11 Data Description Estimation Measures of Central Tendency (continued, see last lecture) Sample mean, population mean Sample mean for frequency distributions The median The mode The midrange 3-22

More information

Notation Measures of Location Measures of Dispersion Standardization Proportions for Categorical Variables Measures of Association Outliers

Notation Measures of Location Measures of Dispersion Standardization Proportions for Categorical Variables Measures of Association Outliers Notation Measures of Location Measures of Dispersion Standardization Proportions for Categorical Variables Measures of Association Outliers Population - all items of interest for a particular decision

More information

Week 1: Intro to R and EDA

Week 1: Intro to R and EDA Statistical Methods APPM 4570/5570, STAT 4000/5000 Populations and Samples 1 Week 1: Intro to R and EDA Introduction to EDA Objective: study of a characteristic (measurable quantity, random variable) for

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

KCP e-learning. test user - ability basic maths revision. During your training, we will need to cover some ground using statistics.

KCP e-learning. test user - ability basic maths revision. During your training, we will need to cover some ground using statistics. During your training, we will need to cover some ground using statistics. The very mention of this word can sometimes alarm delegates who may not have done any maths or statistics since leaving school.

More information

Describing Distributions With Numbers Chapter 12

Describing Distributions With Numbers Chapter 12 Describing Distributions With Numbers Chapter 12 May 1, 2013 What Do We Usually Summarize? Measures of Center. Percentiles. Measures of Spread. A Summary. 1.0 What Do We Usually Summarize? source: Prof.

More information

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22 Announcements Announcements Lecture 1 - Data and Data Summaries Statistics 102 Colin Rundel January 13, 2013 Homework 1 - Out 1/15, due 1/22 Lab 1 - Tomorrow RStudio accounts created this evening Try logging

More information

The empirical ( ) rule

The empirical ( ) rule The empirical (68-95-99.7) rule With a bell shaped distribution, about 68% of the data fall within a distance of 1 standard deviation from the mean. 95% fall within 2 standard deviations of the mean. 99.7%

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Lecture 1: Description of Data. Readings: Sections 1.2,

Lecture 1: Description of Data. Readings: Sections 1.2, Lecture 1: Description of Data Readings: Sections 1.,.1-.3 1 Variable Example 1 a. Write two complete and grammatically correct sentences, explaining your primary reason for taking this course and then

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Chapter 2 Solutions Page 15 of 28

Chapter 2 Solutions Page 15 of 28 Chapter Solutions Page 15 of 8.50 a. The median is 55. The mean is about 105. b. The median is a more representative average" than the median here. Notice in the stem-and-leaf plot on p.3 of the text that

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

Lecture 2. Descriptive Statistics: Measures of Center

Lecture 2. Descriptive Statistics: Measures of Center Lecture 2. Descriptive Statistics: Measures of Center Descriptive Statistics summarize or describe the important characteristics of a known set of data Inferential Statistics use sample data to make inferences

More information

PhysicsAndMathsTutor.com

PhysicsAndMathsTutor.com 1 (i) 0 6 1 5 8 2 1 5 8 3 1 1 3 5 8 9 Key 1 8 represents 18 people Stem (in either order) and leaves Sorted and aligned Key Do not allow leaves 21,25, 28 etc Ignore commas between leaves Allow stem 0,

More information

Review for Algebra Final Exam 2015

Review for Algebra Final Exam 2015 Review for Algebra Final Exam 2015 Could the data in the table represent a linear model. If Linear write an equation to model the relationship. x Y 4 17 2 11 0 5 2 1 4 7 Could the data in the table represent

More information