Chapter 4: Displaying and Summarizing Quantitative Data

This chapter discusses methods of displaying quantitative data. The objective is to describe the distribution of the data. The figure below shows three idealized distributions spanning a range of values from 0 to 10. There are a few key features of a distribution that convey nearly all of the information contained in the data. They are

1. The center and most common values (modes).
2. The spread of the distribution.
3. The shape. Shapes are symmetric (black), skewed right (red), and skewed left (blue).
4. Number of modes. The lower figure shows a distribution with two modes.
5. Deviations from overall shape and outliers. Outliers are values that are far away from the majority of values.

Stemplot: A stemplot is a plot which shows the distribution and includes the numerical values on the plot. The method for drawing a stemplot is illustrated with the following example.

Example: How effective is antibacterial soap? To investigate this question, data were collected on the number of bacterial colonies present on previously sterile media plates two days after placing a hand washed with water (group 1) or with antibacterial soap (group 2) on the plates. The objective is to compare the two distributions of bacterial colonies. Specifically, the aim is to determine whether the antibacterial soap plates have fewer colonies than the water-only plates. More generally, the objective is to characterize the distributions with respect to center, spread, and shape.

The data on number of colonies are given for two groups:

Group 1: Washed with water
Group 2: Washed with soap

[Table: colony counts for each group; values missing.]

To make a stemplot, the rightmost digit of a number becomes a leaf, and the remaining digits become a stem. Each leaf is written next to its stem.

[Figure: stemplot of the colony counts from the antibacterial soap plates.]

The distribution of the number of colonies in this stemplot is skewed to the right and is centered around 105 colonies. There is one large outlier (189). Most of the distribution is between 87 and [missing] colonies.

Notes:
Decimal points are omitted from the stemplot.
The spacing between values remains the same along a row of the stemplot.

The advantage of a stemplot is that it is simple and quick, which makes it useful for hand-drawn distributional displays. Also, the individual data values are retained. Its disadvantage is that it is awkward or impossible to use with large data sets.

A visual comparison of the bacterial counts for the antibacterial soap and the water-only plates is obtained from a back-to-back stemplot. The back-to-back stemplot uses a common stem with leaves branching off in opposite directions.
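The construction rule for a single stemplot (last digit as leaf, remaining digits as stem) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stemplot` and the colony counts are hypothetical (they are not the data from the example), and the values are assumed to be non-negative whole numbers.

```python
from collections import defaultdict


def stemplot(values):
    """Return the rows of a stemplot as strings.

    The rightmost digit of each value is the leaf; the remaining
    digits form the stem. Assumes non-negative integers.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)

    lines = []
    # Print every stem between the smallest and largest, even if it has
    # no leaves, so that gaps and outliers remain visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return lines


# Hypothetical colony counts, NOT the data from the soap example.
counts = [87, 92, 95, 101, 104, 105, 105, 113, 118, 126, 189]
for line in stemplot(counts):
    print(line)
```

Keeping the empty stems is a deliberate choice: a lone value far down the stem, separated by blank rows, is how an outlier shows up in this display.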

[Figure: back-to-back stemplot of colony counts, with water only on one side of the common stem and antibacterial soap on the other.]

The back-to-back stemplot reveals that the water-only distribution is shifted towards larger values and is right-skewed. The antibacterial distribution is not skewed.

Another simple method appropriate only for small data sets is the dotplot. It shows individual data values along with an identifier of the observational unit. The dotplot below shows annual precipitation for 70 large U.S. cities.

[Figure: dotplot of annual precipitation (inches per year) for 70 U.S. cities, with city labels running from Mobile to Phoenix.]

A histogram is a plot which breaks the range of the data (smallest to largest) into intervals and displays the frequency (or relative frequency) of the observations that fall into each interval. Histograms are the usual approach to displaying the distribution of large data sets.

Example: The annual precipitation data are summarized in a table and a histogram (below). The table and histogram show the number of cities with annual precipitation means falling in each of seven intervals.

[Table: columns are Interval (inches), Frequency, and Relative frequency; values missing.]

A histogram is constructed by forming intervals (categories, actually) of 0 to 10 inches, 10 to 20 inches, and so on, and counting the number of values that belong to each interval. The choice of intervals is somewhat subjective, and some trial and error may be necessary to produce a good histogram. Statistical software is usually good at choosing appropriate intervals.

The histogram reveals that the distribution of precipitation for the 70 cities is roughly unimodal and centered near 36 inches. Most of the values are between 10 and 60. (The dotplot reveals that 10 to 53 is a better characterization of the spread of the data.)

[Figure: histogram of annual precipitation; horizontal axis: inches per year; vertical axis: frequency.]

Remarks: The interval widths are equal. This is crucial for obtaining a histogram which accurately reflects the distribution. Intervals and bars are contiguous.
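The counting step behind a histogram (equal-width, contiguous intervals, with frequency and relative frequency for each) can be sketched as follows. This is a minimal illustration, not the software used for the figure: the function name `frequency_table`, the convention that each interval includes its left endpoint but not its right, and the six precipitation values are all assumptions made for the example.

```python
def frequency_table(values, width, start=0):
    """Count values in contiguous, equal-width intervals [lo, lo + width).

    Assumes every value is >= start. Returns one tuple per interval:
    (lo, hi, frequency, relative frequency).
    """
    n_bins = int((max(values) - start) // width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // width)] += 1
    n = len(values)
    return [(start + i * width, start + (i + 1) * width, c, c / n)
            for i, c in enumerate(counts)]


# Hypothetical annual precipitation values (inches), NOT the 70-city data.
inches = [7.0, 12.5, 36.2, 37.0, 39.9, 67.0]
for lo, hi, freq, rel in frequency_table(inches, width=10):
    # One row per interval: a bar of '#' marks, the count, and the proportion.
    print(f"{lo:>3}-{hi:<3} {'#' * freq:<4} {freq} ({rel:.2f})")
```

Changing `width` and rerunning is a quick way to carry out the trial and error over interval choices mentioned above.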

Relative frequency (or percentage or proportion) could be used for the bar heights instead of the frequency.

The primary advantage of the histogram over the stemplot is that arbitrarily large data sets can be displayed using a histogram. The primary disadvantage of the histogram is that it does not retain the actual data values in the plot.

Every graph (not just histograms) must have a label for the horizontal axis, and almost always for the vertical axis. When examining a graph, the first place you should look is at the axis labels.

Distributional shapes: The important characteristics of shape are

1. Mode: The term mode is synonymous with peak. A distribution with one peak is said to be unimodal; a distribution with two peaks is bimodal; and a distribution with more than two peaks is called multimodal. If a distribution does not have a distinct mode, the distribution is said to be uniform, or approximately uniform.
2. Skewness: Skew describes the length of the tails of the distribution. A unimodal distribution will either be symmetric, have a longer tail toward larger values and be called skewed to the right, or have a longer tail toward smaller values and be called skewed to the left. The first figure in this chapter illustrates skew.
3. Unusual features: Unusual features of a distribution often tell something interesting about the data (and the population from which they were sampled). Unusual features are principally outliers (values unusually distant from the bulk of the distribution). A rule for identifying outliers will be introduced later.

Numerical summaries of quantitative data

There are two primary features of the distribution of a quantitative variable: the center and the spread. Statistics used to describe center, spread, and shape are listed in the table below.

Feature    Measures
Center     Median, mean, midrange, mode
Spread     Interquartile range (IQR), standard deviation, range
Shape      5-number summary

Measures of center

1. The median ($M$) of a distribution is a value which divides the ordered data values into two sets equal in number. To find the median:
(a) Order the data from smallest to largest.
(b) Determine the number of data values (usually called the sample size and denoted by $n$). Note whether $n$ is even or odd.
i. If $n$ is odd, then the median is the value at the middle point in the ordered list; specifically, the median is the value at position $(n+1)/2$.
ii. If $n$ is even, the median is between the two middle values of the ordered list. These are located at positions $n/2$ and $n/2 + 1$. The convention is to take the average of these two middle values.

Example: Return to the annual precipitation data:
(a) The sample size is $n = 70$, and so the median is the average of the 35th and 36th smallest values. They are 36.2 and 37.0 (Pittsburgh and Kansas City). The average of these values is the median; hence $M = (36.2 + 37.0)/2 = 36.6$ inches.
(b) Suppose that we include Missoula in the data set. The average annual precipitation for Missoula is 13.7 inches. What is the median of this set of data? After including Missoula, $n = 71$, so the $(n+1)/2 = 36$th smallest ordered value is the median. Pittsburgh was 35th; now it is 36th, and so the new median is 36.2 inches.

2. The mean of a set of data is the average. Let $y_1, y_2, \ldots, y_n$ denote $n$ data values. The mean is

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}(y_1 + y_2 + \cdots + y_n) = \frac{y_1 + \cdots + y_n}{n}.$$

The symbol $\bar{y}$ often is referred to as y-bar. The notation $\frac{1}{n}\sum y$ sometimes is used for the mean. The mean of the precipitation data is $\bar{y} = 34.9$ inches. With Missoula included, the mean is 34.6 inches.

3. The mode of a data set is the value that occurs most frequently. If there are multiple values that occur most frequently, all of these values are modes of the distribution. The mode generally is not an appropriate measure of center. To illustrate, I've rounded the precipitation data to the nearest integer and used these data to construct the histogram below. The modes are 36 and 43 inches. The characterization of the center by these modes is not particularly informative. The histogram is also uninformative, as the intervals are too narrow to adequately portray the shape of the data.

[Figure: histogram of the rounded precipitation data with very narrow intervals; horizontal axis: annual precipitation (inches); vertical axis: frequency.]

A statistic is resistant if it is not substantially affected by changes in the numerical values of a small proportion of the observations. Outliers and long tails often substantially affect a statistic that is not resistant. A statistic which is not resistant to some distributional feature is said to be sensitive to that feature.

Sensitivity to outliers

The median is resistant to the effects of outliers, whereas the mean is sensitive. To illustrate, consider the effects of accidentally recording 670 inches for Mobile instead of 67.0. The mean would have been computed as 43.5 inches, but the median (36.6 inches) would not be different.

The effect of skew also differs between the mean and the median: the mean is shifted toward a long tail compared to the median. (The median resists being shifted toward a long tail.)

Symmetric: $M = \bar{y}$; skewed right: $M < \bar{y}$; skewed left: $\bar{y} < M$.
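The position rule for the median and the sensitivity of the mean to a single bad value can both be checked with a short Python sketch. The seven precipitation values below are hypothetical (they are not the 70-city data), and the function names are chosen only for this illustration.

```python
def median(values):
    """Median by the position rule: position (n+1)/2 if n is odd,
    otherwise the average of positions n/2 and n/2 + 1."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Positions in the rule are 1-based; list indices are 0-based.
        return ordered[(n + 1) // 2 - 1]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def mean(values):
    """Arithmetic average: the sum of the values divided by n."""
    return sum(values) / len(values)


# Hypothetical precipitation values (inches).
rain = [7.0, 15.0, 30.0, 36.0, 37.0, 43.0, 67.0]
# The same data with the largest value mis-recorded (decimal point dropped).
typo = rain[:-1] + [670.0]

print("median:", median(rain), "->", median(typo))
print("mean:  ", round(mean(rain), 1), "->", round(mean(typo), 1))
```

The median is the same for both lists because only the value in the middle position matters, while the mean changes because every value, including the mis-recorded one, enters the sum.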

Percentiles and quartiles

The $p$th percentile of a distribution is the value such that $p$% of the data values fall below it. If your SAT math percentile was 80%, then your score was larger than 80% of all scores (and smaller than 20%). The quartiles are the 25th, 50th and 75th percentiles, and so they divide the data set into four sets of equal size. The notation is

$Q_1$ = 25th percentile = 1st quartile
$M$ = 50th percentile = 2nd quartile, or the median
$Q_3$ = 75th percentile = 3rd quartile

Measures of spread

1. The range of the data is the difference between the maximum and minimum values: $\text{Range} = \text{Max} - \text{Min}$. The range is too sensitive to outliers to be of much use.

2. The interquartile range is the distance between the 25th and 75th percentiles; hence, $\text{IQR} = Q_3 - Q_1$.

Remarks:
The IQR measures the spread of the middle 50% of the data.
The IQR is a single number, not the pair $Q_1$ and $Q_3$. It is the width of the interval between $Q_1$ and $Q_3$.
There are several algorithms for finding the quartiles, all of which find the median, use it to divide the data into upper and lower halves, and find the median of each half; these medians are the quartiles. De Veaux et al. recommend this algorithm:
If $n$ is even (so the median is the average of the two middle values), include the $(n/2)$th smallest value in the lower half of the data and the $(n/2 + 1)$th smallest value in the upper half.
If $n$ is odd (so the median is the single middle value), include the median (the $((n+1)/2)$th smallest value) in both the lower and upper halves of the data.
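The quartile algorithm just described (split at the median, include the median in both halves when $n$ is odd, then take the median of each half) can be sketched as follows. The function names and the small data sets are hypothetical. Other software may use a different quartile rule and return slightly different values.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def quartiles(values):
    """Return (Q1, M, Q3) using the split-at-the-median rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 0:
        # n even: the two halves do not overlap.
        lower, upper = ordered[:half], ordered[half:]
    else:
        # n odd: the median (index `half`) belongs to both halves.
        lower, upper = ordered[:half + 1], ordered[half:]
    return median(lower), median(ordered), median(upper)


def iqr(values):
    """Interquartile range: a single number, Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


# Hypothetical data sets: one with n even, one with n odd.
even_n = [1, 2, 3, 4, 5, 6, 7, 8]
odd_n = [3, 7, 8, 5, 12, 14, 21, 13, 18]
print(quartiles(even_n), iqr(even_n))
print(quartiles(odd_n), iqr(odd_n))
```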

To illustrate the quartile calculation, consider the following data on mean temperature, by month, for Missoula (units are degrees Fahrenheit).

[Table: mean monthly temperature for Missoula, January through December, followed by the same values in increasing order; values missing.]

Then $Q_1 = 29.8$ degrees, $Q_3$ is found in the same way from the upper half of the ordered values, and $\text{IQR} = Q_3 - Q_1 = 28.05$ degrees.

The monthly temperature quartiles for San Francisco are $Q_1 = 53.0$, $Q_2$ = [missing] and $Q_3$ = [missing]. The following table compares Missoula, San Francisco, and one other city.

Table 1: Measures of center and spread for Missoula, San Francisco, and a mystery city, shown as "?" (degrees Fahrenheit). Columns are City, $M$, and IQR; values missing.

[Figure: time plot of mean monthly temperature (degrees F) against month for Missoula, San Francisco, and the mystery city.]

The figure is a time plot of the mean monthly temperature against month. From this figure, the constancy of temperatures in San Francisco becomes strikingly obvious. It is also apparent that the mystery city is generally colder than Missoula and that the greatest differences in mean monthly temperatures occur in the summer and winter.

3. The standard deviation is the most commonly used numerical summary of distributional spread. It is (roughly) the average difference between the mean $\bar{y}$ and the data values.

Recall: The values in a data set are denoted $y_1, y_2, \ldots, y_n$ for a sample size of $n$. The standard deviation is computed from the deviations of the observed values from the mean, namely $y_1 - \bar{y}, y_2 - \bar{y}, \ldots, y_n - \bar{y}$. Since the deviations sum to 0, the average deviation is not a measure of spread. To rectify this problem, the standard deviation is computed from the squared deviations (which are all greater than or equal to 0). The squared deviations are summed and divided by $n - 1$. The final operation computes the square root. A formula for the standard deviation is

$$s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}.$$

This is roughly the average distance of the data values from the mean, which is a logical measure of spread. The term "roughly" is used because $n - 1$ is the denominator rather than $n$. Taking the square root puts the result back in the original units of measure.

Squaring the standard deviation gives the variance. The relationship between the two is summarized by

$$\text{var} = s^2, \qquad s = \sqrt{\text{var}}.$$

A related, alternative measure is the mean absolute deviation about the median:

$$\frac{1}{n}\sum |y_i - M|.$$
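The steps in the standard deviation calculation (deviations, squared deviations, division by $n - 1$, square root) can be traced in a short Python sketch. The five temperatures are hypothetical and the function names are chosen only for this illustration.

```python
import math


def deviations(values):
    """Deviations of each value from the mean, y_i - ybar."""
    ybar = sum(values) / len(values)
    return [y - ybar for y in values]


def sample_sd(values):
    """Standard deviation with n - 1 in the denominator."""
    squared = [d ** 2 for d in deviations(values)]
    return math.sqrt(sum(squared) / (len(values) - 1))


# Hypothetical monthly mean temperatures (degrees F).
temps = [20.0, 30.0, 40.0, 50.0, 60.0]

print(sum(deviations(temps)))   # the deviations sum to 0
s = sample_sd(temps)
variance = s ** 2               # squaring s gives the variance
print(round(s, 2), round(variance, 1))
```

For these values the mean is 40, the squared deviations sum to 1000, and dividing by $n - 1 = 4$ gives a variance of 250, so $s = \sqrt{250}$, or about 15.8 degrees.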

Example: To illustrate the calculation, the standard deviation of monthly temperature averages from Vostok, Antarctica (elevation: 11220 feet, latitude: 78°27'S, longitude: 106°52'E) is computed. The lowest temperature in the station's records cited here was -127 degrees F; for comparison, the freezing point of CO2 is about -109 degrees F.

First, the annual mean temperature is $\bar{y} = -68.5$ degrees F. The intermediate steps are laid out in three columns, $y_i$, $(y_i - \bar{y})$, and $(y_i - \bar{y})^2$, with a total for the last column.

[Table: the twelve monthly values with their deviations and squared deviations; values missing.]

The last stages of the calculation are

$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}, \quad n - 1 = 11, \quad s = 23.9 \text{ degrees F}.$$

$s = 23.9$ degrees F is interpreted as roughly the average difference between the annual mean temperature ($\bar{y} = -68.5$) and the monthly mean temperatures. For comparison, the standard deviation of the Missoula temperatures was $s = 16.1$ degrees F. There is considerably greater variability in monthly mean temperatures at Vostok than at Missoula.

The 5-number summary numerically summarizes the shape of a distribution. It is (Min, $Q_1$, $M$, $Q_3$, Max). To compare the three cities more closely, their 5-number summaries are tabulated:

[Table: columns are City, Min, $Q_1$, $M$, $Q_3$, Max; rows are the mystery city (?), Missoula, and San Francisco; values missing.]

The mystery city does appear colder than Missoula throughout the year.

Summary of resistant and sensitive measures:

The mean is sensitive to the effects of outliers, whereas the median is resistant to the effects of outliers.
The standard deviation is sensitive to the effects of outliers, whereas the IQR is resistant to the effects of outliers.

The IQR is resistant because it is the difference between $Q_3$ and $Q_1$. Outliers have little influence on the IQR, since unusually large or small observations have little effect on $Q_3$ and $Q_1$. In contrast, all data values (including outliers) are used in the calculation of the standard deviation.

Summarizing a distribution with a measure of center and spread

Since the mean and standard deviation are not resistant, they are not appropriate for skewed distributions or distributions with outliers. They are most appropriate for symmetric distributions with no outliers.

Situation                                  Measures to use
Symmetric distribution with no outliers    Mean and SD, or median and IQR
Skewed distributions                       Median and IQR
Symmetric distributions with outliers      Median and IQR


More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

University of California, Berkeley, Statistics 131A: Statistical Inference for the Social and Life Sciences. Michael Lugo, Spring 2012

University of California, Berkeley, Statistics 131A: Statistical Inference for the Social and Life Sciences. Michael Lugo, Spring 2012 University of California, Berkeley, Statistics 3A: Statistical Inference for the Social and Life Sciences Michael Lugo, Spring 202 Solutions to Exam Friday, March 2, 202. [5: 2+2+] Consider the stemplot

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

Describing Distributions

Describing Distributions Describing Distributions With Numbers April 18, 2012 Summary Statistics. Measures of Center. Percentiles. Measures of Spread. A Summary Statement. Choosing Numerical Summaries. 1.0 What Are Summary Statistics?

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

CHAPTER 1 Exploring Data

CHAPTER 1 Exploring Data CHAPTER 1 Exploring Data 1.2 Displaying Quantitative Data with Graphs The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Displaying Quantitative Data

More information

American Tour: Climate Objective To introduce contour maps as data displays.

American Tour: Climate Objective To introduce contour maps as data displays. American Tour: Climate Objective To introduce contour maps as data displays. www.everydaymathonline.com epresentations etoolkit Algorithms Practice EM Facts Workshop Game Family Letters Assessment Management

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Chapters 1 & 2 Exam Review

Chapters 1 & 2 Exam Review Problems 1-3 refer to the following five boxplots. 1.) To which of the above boxplots does the following histogram correspond? (A) A (B) B (C) C (D) D (E) E 2.) To which of the above boxplots does the

More information

Authors: Antonella Zanobetti and Joel Schwartz

Authors: Antonella Zanobetti and Joel Schwartz Title: Mortality Displacement in the Association of Ozone with Mortality: An Analysis of 48 US Cities Authors: Antonella Zanobetti and Joel Schwartz ONLINE DATA SUPPLEMENT Additional Information on Materials

More information

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline.

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. data; variables: categorical & quantitative; distributions; bar graphs & pie charts: What Is Statistics?

More information

Index I-1. in one variable, solution set of, 474 solving by factoring, 473 cubic function definition, 394 graphs of, 394 x-intercepts on, 474

Index I-1. in one variable, solution set of, 474 solving by factoring, 473 cubic function definition, 394 graphs of, 394 x-intercepts on, 474 Index A Absolute value explanation of, 40, 81 82 of slope of lines, 453 addition applications involving, 43 associative law for, 506 508, 570 commutative law for, 238, 505 509, 570 English phrases for,

More information

City Number Pct. 1.2 STEMS AND LEAVES

City Number Pct. 1.2 STEMS AND LEAVES 1.2 STEMS AND LEAVES Think back on the preceding example. We dealt with a list of cities giving their populations and areas. Usually the science of statistics does not concern itself with identifying the

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics CHAPTER OUTLINE 6-1 Numerical Summaries of Data 6- Stem-and-Leaf Diagrams 6-3 Frequency Distributions and Histograms 6-4 Box Plots 6-5 Time Sequence Plots 6-6 Probability Plots Chapter

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. 1 Histograms p53 The breakfast cereal data Study collected data on nutritional

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included:

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included: AP Statistics Chapter 2 Notes 2.1 Describing Location in a Distribution Percentile: The pth percentile of a distribution is the value with p percent of the observations (If your test score places you in

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Week 1: Intro to R and EDA

Week 1: Intro to R and EDA Statistical Methods APPM 4570/5570, STAT 4000/5000 Populations and Samples 1 Week 1: Intro to R and EDA Introduction to EDA Objective: study of a characteristic (measurable quantity, random variable) for

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Investigation 11.3 Weather Maps

Investigation 11.3 Weather Maps Name: Date: Investigation 11.3 Weather Maps What can you identify weather patterns based on information read on a weather map? There have been some amazing technological advancements in the gathering and

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

Understanding the Impact of Weather for POI Recommendations

Understanding the Impact of Weather for POI Recommendations S C I E N C E P A S S I O N T E C H N O L O G Y Understanding the Impact of Weather for POI Recommendations Christoph Trattner, Alex Oberegger, Lukas Eberhard, Denis Parra, Leandro Marinho, Know-Center@Graz

More information

Lecture: Sampling and Standard Error LECTURE 8 1

Lecture: Sampling and Standard Error LECTURE 8 1 Lecture: Sampling and Standard Error 6.0002 LECTURE 8 1 Announcements Relevant reading: Chapter 17 No lecture Wednesday of next week! 6.0002 LECTURE 8 2 Recall Inferential Statistics Inferential statistics:

More information

P1: OTA/XYZ P2: ABC JWBS077-fm JWBS077-Horstmeyer July 30, :18 Printer Name: Yet to Come THE WEATHER ALMANAC

P1: OTA/XYZ P2: ABC JWBS077-fm JWBS077-Horstmeyer July 30, :18 Printer Name: Yet to Come THE WEATHER ALMANAC THE WEATHER ALMANAC THE WEATHER ALMANAC A reference guide to weather, climate, and related issues in the United States and its key cities TWELFTH EDITION Steven L. Horstmeyer A JOHN WILEY & SONS, INC.,

More information

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Spring 2015: Lembo GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Descriptive statistics concise and easily understood summary of data set characteristics

More information

Statistics and parameters

Statistics and parameters Statistics and parameters Tables, histograms and other charts are used to summarize large amounts of data. Often, an even more extreme summary is desirable. Statistics and parameters are numbers that characterize

More information

RNR 516A. Computer Cartography. Spring GIS Portfolio

RNR 516A. Computer Cartography. Spring GIS Portfolio RNR 516A Computer Cartography Spring 2016 GIS Portfolio 1 Contents 1 Political and Locator Maps 3 2 Base Maps and Digitizing 4 3 Data Entry Report 5 4 Projections and Symbolization 6 5 Choropleth Mapping

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

Chapter 3 Data Description

Chapter 3 Data Description Chapter 3 Data Description Section 3.1: Measures of Central Tendency Section 3.2: Measures of Variation Section 3.3: Measures of Position Section 3.1: Measures of Central Tendency Definition of Average

More information

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. 1 Histograms p53 Spoiled ballots are a real threat to democracy. Below are

More information

Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2)

Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2) Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2) Name: PeopleSoft ID: Instructions: Homework will NOT be accepted through email or in person. Homework must be submitted through CourseWare BEFORE the deadline.

More information