Chapter 4: Displaying and Summarizing Quantitative Data
- Cornelius Byrd
- 5 years ago
This chapter discusses methods of displaying quantitative data. The objective is to describe the distribution of the data. The figure below shows three idealized distributions spanning a range of values from 0 to 10. A few key features of a distribution convey nearly all of the information contained in the data. They are

1. The center and most common values (modes).
2. The spread of the distribution.
3. The shape. Shapes are symmetric (black), skewed right (red), and skewed left (blue).
4. Number of modes. The lower figure shows a distribution with two modes.
5. Deviations from overall shape and outliers. Outliers are values that are far away from the majority of values.

Stemplot: A stemplot is a plot that shows the distribution and includes the numerical values on the plot. The method for drawing a stemplot is illustrated with the following example.

Example: How effective is antibacterial soap? To investigate this question, data were collected on the number of bacterial colonies present on previously sterile media plates two days after a hand washed with water only (group 1) or with antibacterial soap (group 2) was placed on the plates. The objective is to compare the two distributions of bacterial colonies. Specifically, the aim is to determine whether the antibacterial soap plates have fewer colonies than the water-only plates. More generally, the objective is to characterize the distributions with respect to center, spread and shape.
The data on number of colonies are

Group 1: Washed with water [values not recoverable]
Group 2: Washed with soap [values not recoverable]

To make a stemplot, the rightmost digit of a number becomes a leaf, and the remaining digits become a stem. Each leaf is written on the row of its stem.

[Stemplot of the colony counts for the antibacterial soap plates.]

The distribution of the number of colonies is skewed to the right and is centered around 105 colonies. There is one large outlier (189). Most of the distribution is between 87 and [value missing] colonies.

Notes:
- Decimal points are omitted from the stemplot.
- The spacing between values remains the same along a row of the stemplot.

The advantage of a stemplot is that it is simple and quick, which makes it useful for hand-drawn distributional displays. Also, the individual data values are retained. Its disadvantage is that it is awkward or impossible to use with large data sets.

A visual comparison of the bacterial counts for the antibacterial soap and the water-only plates is obtained from a back-to-back stemplot. The back-to-back stemplot uses a common stem with leaves branching off in opposite directions.
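The construction rule for a single stemplot (rightmost digit as leaf, remaining digits as stem) can be sketched in a few lines of Python. This is a minimal illustration, not the method used to draw the original figure; the function name `stemplot` and the colony counts are hypothetical, apart from the values 87, 105 and 189 mentioned in the text.

```python
from collections import defaultdict

def stemplot(values):
    """Return the text rows of a stemplot for non-negative integers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # leaf = rightmost digit, stem = the rest
        leaves[stem].append(leaf)
    rows = []
    # Empty stems are kept so that gaps and outliers remain visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem:>3} | " + " ".join(str(d) for d in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

# Hypothetical colony counts, for illustration only.
counts = [87, 93, 105, 105, 112, 118, 189]
for row in stemplot(counts):
    print(row)
```

Keeping the empty stems between 11 and 18 is a deliberate choice: it preserves the scale, so a value such as 189 stands apart from the rest of the display.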
[Back-to-back stemplot: water-only leaves on the left, antibacterial soap leaves on the right, sharing a common stem.]

The back-to-back stemplot reveals that the water-only distribution is shifted towards larger values and is right-skewed. The antibacterial distribution is not skewed.

Another simple method appropriate only for small data sets is the dotplot. It shows individual data values along with an identifier of the observational unit. The dotplot below shows annual precipitation for 70 large U.S. cities.

[Dotplot: annual precipitation (inches per year) for 70 U.S. cities, listed from wettest to driest: Mobile, Miami, San Juan, New Orleans, Juneau, Jacksonville, Jackson, Memphis, Little Rock, Atlanta, Houston, Columbia, Nashville, Atlantic City, Norfolk, Hartford, Louisville, Providence, Charlotte, Richmond, Raleigh, Boston, Baltimore, Charleston, New York, Wilmington, Philadelphia, Cincinnati, Washington, Seattle Tacoma, Indianapolis, Portland, Columbus, Kansas City, Pittsburgh, Concord, Buffalo, Dallas, St Louis, Peoria, Cleveland, Chicago, Albany, Burlington, Sault Ste. Marie, Oklahoma City, Detroit, Des Moines, Wichita, Omaha, Duluth, Milwaukee, Minneapolis/St Paul, Sioux Falls, Honolulu, San Francisco, Spokane, Sacramento, Bismarck, Salt Lake City, Great Falls, Cheyenne, Los Angeles, Denver, Boise, El Paso, Albuquerque, Reno, Phoenix.]
A histogram is a plot that breaks the range of the data (smallest to largest) into intervals and displays the frequency (or relative frequency) of the observations that fall into each interval. Histograms are the usual approach to displaying the distribution of large data sets.

Example: The annual precipitation data are summarized in a table and a histogram (below). The table and histogram show the number of cities with annual precipitation means falling in each of seven intervals.

[Table: columns Interval (inches), Frequency, Relative frequency; the values are not recoverable.]

A histogram is constructed by forming intervals (categories, actually) of 0 to 10 inches, 10 to 20 inches and so on, and counting the number of values that belong to each interval. The choice of intervals is somewhat subjective, and some trial and error may be necessary to produce a good histogram. Statistical software is usually good at choosing appropriate intervals.

The histogram reveals that the distribution of precipitation for the 70 cities is roughly unimodal and centered near 36 inches. Most of the values are between 10 and 60 inches. (The dotplot reveals that 10 to 53 inches is a better characterization of the spread of the data.)

[Histogram: frequency versus annual precipitation (inches per year).]

Remarks:
- The interval widths are equal. This is crucial for obtaining a histogram that accurately reflects the distribution.
- Intervals and bars are contiguous.
- Relative frequency (or percentage or proportion) could be used for the bar heights instead of the frequency.
- The primary advantage of the histogram over the stemplot is that arbitrarily large data sets can be displayed using a histogram. The primary disadvantage of the histogram is that it does not retain the actual data values in the plot.
- Every graph (not just histograms) must have a label for the horizontal axis, and almost always one for the vertical axis. When examining a graph, the first place you should look is at the axis labels.

Distributional shapes: The important characteristics of shape are

1. Mode: The term mode is synonymous with peak. A distribution with one peak is said to be unimodal; a distribution with two peaks is bimodal; and a distribution with more than two peaks is called multimodal. If a distribution does not have a distinct mode, the distribution is said to be uniform, or approximately uniform.
2. Skewness: Skew describes the length of the tails of the distribution. A unimodal distribution will either be symmetric, have a longer tail toward larger values and be called skewed to the right, or have a longer tail toward smaller values and be called skewed to the left. The first figure in this chapter illustrates skew.
3. Unusual features: Unusual features of a distribution often tell something interesting about the data (and the population from which they were sampled). Unusual features are principally outliers (values unusually distant from the bulk of the distribution). A rule for identifying outliers will be introduced later.

Numerical summaries of quantitative data

There are two primary features of the distribution of a quantitative variable: the center and the spread. Statistics used to describe center, spread, and shape are tabled below.

Feature   Measures
Center    Median, mean, midrange, mode
Spread    Interquartile range (IQR), standard deviation, range
Shape     5-number summary
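Returning briefly to the histogram construction described earlier (equal-width, contiguous intervals, with frequency or relative frequency as bar height), the counting step can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `histogram_counts` is hypothetical, intervals are taken as closed on the left and open on the right, and the seven precipitation values are illustrative rather than the full 70-city data set.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width intervals [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)   # index of the interval containing v
        if 0 <= k < n_bins:             # values outside the range are ignored
            counts[k] += 1
    return counts

# Illustrative values in inches per year (not the full data set).
precip = [7.0, 15.2, 36.2, 37.0, 42.5, 59.9, 67.0]
counts = histogram_counts(precip, start=0, width=10, n_bins=7)
rel_freq = [c / len(precip) for c in counts]

# Crude text histogram: one asterisk per observation.
for k, (c, r) in enumerate(zip(counts, rel_freq)):
    low = 10 * k
    print(f"{low:>2} to {low + 10:<2} | {'*' * c:<3} frequency={c} relative={r:.2f}")
```

Dividing each count by the sample size changes only the scale of the vertical axis, which is why frequency and relative-frequency histograms have the same shape.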
Measures of center

1. The median (M) of a distribution is a value that divides the ordered data values into two sets equal in number. To find the median:
   (a) Order the data from smallest to largest.
   (b) Determine the number of data values (usually called the sample size and denoted by n). Note whether n is even or odd.
       i. If n is odd, then the median is the value at the middle point in the ordered list; specifically, the median is the value at position $(n+1)/2$.
       ii. If n is even, the median is between the two middle values of the ordered list. These are located at positions $n/2$ and $n/2 + 1$. The convention is to take the average of these two middle values.

   Example: Return to the annual precipitation data.
   (a) The sample size is n = 70, and so the median is the average of the 35th and 36th smallest values. They are 36.2 and 37.0 (Pittsburgh and Kansas City). The average of these values is the median; hence $M = (36.2 + 37.0)/2 = 36.6$ inches.
   (b) Suppose that we include Missoula in the data set. The average annual precipitation for Missoula is 13.7 inches. What is the median of this set of data? (See the note at the end of this section.)

2. The mean of a set of data is the average. Let $y_1, y_2, \ldots, y_n$ denote n data values. The mean is

   $$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}(y_1 + y_2 + \cdots + y_n) = \frac{y_1 + \cdots + y_n}{n}.$$

   The symbol $\bar{y}$ often is referred to as y-bar. The notation $\frac{1}{n}\sum y$ sometimes is used for the mean. The mean of the precipitation data is $\bar{y} = 34.9$ inches. With Missoula included, the mean is 34.6 inches.

Note on (b): After including Missoula, n = 71, so the $(n+1)/2 = 36$th smallest ordered value is the median. Pittsburgh was 35th; now it is 36th, and so the new median is 36.2 inches.
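The odd and even cases of the median rule, and the definition of the mean, translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the short list of values is made up around the two middle values 36.2 and 37.0 quoted in the example, not the full 70-city data set.

```python
def median(values):
    """Median by the rule above: middle value if n is odd, average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # 1-based position (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2  # 1-based positions n/2 and n/2 + 1

def mean(values):
    """Arithmetic mean: the sum of the values divided by the sample size."""
    return sum(values) / len(values)

# Illustrative values; only 36.2, 37.0 and 13.7 appear in the text.
sample = [31.0, 36.2, 37.0, 43.0]
print(median(sample), mean(sample))    # even n: average of 36.2 and 37.0

with_extra = sample + [13.7]           # adding one small value makes n odd
print(median(with_extra), mean(with_extra))
```

As in the Missoula example, adding one small value moves the median only to the neighbouring ordered value, while the mean is pulled down by the full size of the new observation.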
3. The mode of a data set is the value that occurs most frequently. If there are multiple values that occur most frequently, all of these values are modes of the distribution. The mode generally is not an appropriate measure of center. To illustrate, I've rounded the precipitation data to the nearest integer and used these data to construct the histogram below. The modes are 36 and 43 inches. The characterization of the center by these modes is not particularly informative. The histogram is also uninformative, as the intervals are too small to adequately portray the shape of the data.

[Histogram: frequency versus annual precipitation (inches), data rounded to the nearest integer.]

A statistic is resistant if it is not substantially affected by changes in the numerical values of a small proportion of the observations. Outliers and long tails often substantially affect a statistic that is not resistant. A statistic that is not resistant to some distributional feature is sensitive to that feature.

Sensitivity to outliers

The median is resistant to the effects of outliers whereas the mean is sensitive. To illustrate, consider the effects of accidentally recording 670 inches for Mobile instead of 67 inches. The mean would have been computed as 43.5 inches, but the median (36.6 inches) would not be different.

The effect of skew differs between the mean and the median. Specifically, the mean is shifted toward a long tail compared to the median. (The median resists being shifted toward a long tail.)

Symmetric: $M = \bar{y}$; Skewed right: $M < \bar{y}$; Skewed left: $\bar{y} < M$.
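The resistance of the median, and the sensitivity of the mean, can be checked numerically with Python's standard `statistics` module. The data below are a small illustrative list, not the 70-city data set; only the mis-recording of 67 as 670 is taken from the text.

```python
import statistics

# Illustrative precipitation values (inches); the largest value plays the role of Mobile.
data = [7.0, 15.0, 31.0, 36.2, 37.0, 43.0, 67.0]

# Simulate the recording error: 670 entered instead of 67.
corrupted = [670.0 if v == 67.0 else v for v in data]

mean_shift = statistics.mean(corrupted) - statistics.mean(data)
median_shift = statistics.median(corrupted) - statistics.median(data)

print(f"change in mean:   {mean_shift:.2f}")
print(f"change in median: {median_shift:.2f}")
```

The mean moves by the size of the error divided by n, so a single wild value can dominate it in a small sample. The median does not move at all here, because the error changes the value of the largest observation but not which observation sits in the middle of the ordered list.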
Percentiles and quartiles

The p-th percentile of a distribution is the value such that p% of the data values fall below it. If your SAT math percentile was 80%, then your score was larger than 80% of all scores (and smaller than 20%). The quartiles are the 25th, 50th and 75th percentiles, and so they divide the data set into four sets of equal size. The notation is

$Q_1$ = 25th percentile = 1st quartile
$M$ = 50th percentile = 2nd quartile, or the median
$Q_3$ = 75th percentile = 3rd quartile

Measures of spread

1. The range of the data is the difference between the maximum and minimum values: Range = Max − Min. The range is too sensitive to outliers to be of much use.
2. The interquartile range is the distance between the 25th and 75th percentiles; hence, $\mathrm{IQR} = Q_3 - Q_1$.

Remarks:
- The IQR measures the spread of the middle 50% of the data.
- The IQR is a single number, not the pair $Q_1$ and $Q_3$. It is the width of the interval between $Q_1$ and $Q_3$.
- There are several algorithms for finding the quartiles, all of which find the median, use it to divide the data into lower and upper halves, and find the median of each half; these medians are the quartiles. De Veaux et al. recommend this algorithm:
  - If n is even (so that the median is the average of the two middle values), include the $(n/2)$-th smallest value in the lower half of the data and the $(n/2 + 1)$-th smallest value in the upper half.
  - If n is odd (so that the median is the single middle value), include the median (the $(n+1)/2$-th smallest value) in both the lower and upper halves of the data.
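The quartile rule attributed above to De Veaux et al. can be written as a short function. This is a sketch of that rule as described in the text, with hypothetical function names; other textbooks and software packages use different quartile conventions and may give slightly different answers on the same data.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, M, Q3) using the split-at-the-median rule described above."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 0:
        lower = ordered[: n // 2]        # the n/2 smallest values
        upper = ordered[n // 2 :]        # the n/2 largest values
    else:
        lower = ordered[: n // 2 + 1]    # the median is included in both halves
        upper = ordered[n // 2 :]
    return median(lower), median(ordered), median(upper)

def iqr(values):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

# Small made-up examples, one with even n and one with odd n.
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]), iqr([1, 2, 3, 4, 5, 6, 7, 8]))
print(quartiles([1, 2, 3, 4, 5, 6, 7]), iqr([1, 2, 3, 4, 5, 6, 7]))
```

Prepending the minimum and appending the maximum to the tuple returned by `quartiles` would give the 5-number summary.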
To illustrate, consider the following data on mean temperature, by month, for Missoula (units are degrees Fahrenheit).

[Table: mean temperature by month (J F M A M J J A S O N D) and the same values in sorted order; the values are not recoverable.]

Then

$Q_1$ = 29.8 degrees,
$Q_3$ = [value missing] degrees, and
$\mathrm{IQR} = Q_3 - Q_1$ = 28.05 degrees.

The monthly temperature quartiles for San Francisco are $Q_1$ = 53.0, $Q_2$ = [value missing] and $Q_3$ = [value missing]. The following table compares Missoula, San Francisco, and one other city.

[Table 1: Measures of center and spread for Missoula and San Francisco (degrees Fahrenheit). Columns: City, M, IQR. Rows: Missoula, San Francisco, ? (mystery city). The values are not recoverable.]

The figure below is a time plot of the mean monthly temperature against month. From this figure, the constancy of temperatures in San Francisco becomes strikingly obvious. It is also apparent that the mystery city is generally colder than Missoula and that the greatest differences in mean monthly temperatures occur in the summer and winter.

[Figure: time plot of mean monthly temperature (degrees F) against month for Missoula, San Francisco, and the mystery city (?).]
3. The standard deviation is the most commonly used numerical summary of distributional spread. It is (roughly) the average difference between the mean $\bar{y}$ and the data values.

Recall: The values in a data set are denoted $y_1, y_2, \ldots, y_n$ for a sample size of n. The standard deviation is computed from the deviations of the observed values from the mean, namely

$$y_1 - \bar{y},\; y_2 - \bar{y},\; \ldots,\; y_n - \bar{y}.$$

Since the deviations sum to 0, the average deviation is not a measure of spread. To rectify this problem, the standard deviation is computed from the squared deviations (which are all greater than or equal to 0). The squared deviations are summed and divided by $n - 1$. The final operation computes the square root. A formula for the standard deviation is

$$s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}.$$

This is roughly the average distance of the data values from the mean, which is a logical measure of spread. The term roughly is used because $n - 1$ is the denominator rather than $n$. Taking the square root puts the result back in the original units of measure.

Squaring the standard deviation gives the variance. The relationship between the two is summarized by

$$\mathrm{var} = s^2, \qquad s = \sqrt{\mathrm{var}}.$$

A related, alternative measure is the mean absolute deviation about the median:

$$\frac{1}{n}\sum |y_i - M|.$$
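The steps in the standard deviation calculation (deviations, squared deviations, division by n − 1, square root) can be followed one at a time in code. The sketch below uses a small made-up data set and hypothetical function names; the result is cross-checked against `statistics.stdev` from the Python standard library, which uses the same n − 1 denominator. The second function implements the absolute-deviation measure with the 1/n average, as in the last formula above.

```python
import math
import statistics

def sample_sd(values):
    """Standard deviation with n - 1 in the denominator."""
    n = len(values)
    ybar = sum(values) / n
    deviations = [y - ybar for y in values]           # these sum to 0 (up to rounding)
    sum_sq = sum(d ** 2 for d in deviations)          # squared deviations are all >= 0
    return math.sqrt(sum_sq / (n - 1))

def mean_abs_dev_about_median(values):
    """Average absolute distance of the values from the median."""
    m = statistics.median(values)
    return sum(abs(y - m) for y in values) / len(values)

# Made-up data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s = sample_sd(data)
print(f"s = {s:.3f}, variance = {s ** 2:.3f}")
print(f"statistics.stdev gives {statistics.stdev(data):.3f}")
print(f"mean absolute deviation about the median = {mean_abs_dev_about_median(data):.3f}")
```

For these data the squared deviations sum to 32, so the variance is 32/7 and the standard deviation is its square root.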
Example: To illustrate, the standard deviation of monthly temperature averages from Vostok, Antarctica (elevation: 11,220 feet, latitude: 78°27′S, longitude: 106°52′E) is computed. (Footnote: the lowest recorded temperature at Vostok and, for comparison, the freezing point of CO2 were given here in degrees F; the values are garbled in the source.)

First, the annual mean temperature is $\bar{y} = -68.5$ degrees F. The table shows the intermediate steps; the formula shows the last stages of the calculation, with $n - 1 = 11$ in the denominator.

[Table: columns $y_i$, $(y_i - \bar{y})$, $(y_i - \bar{y})^2$, with a Total row; the values are not recoverable.]

$$s = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n - 1}} = 23.9 \text{ degrees F}$$

The value s = 23.9 degrees F is interpreted as roughly the average difference between the annual mean temperature ($\bar{y} = -68.5$) and the monthly mean temperatures. For comparison, the standard deviation of the Missoula temperatures was s = 16.1 degrees F. There is considerably greater variability in monthly mean temperatures at Vostok compared to Missoula.

The 5-number summary numerically summarizes the shape of a distribution. It is (Min, $Q_1$, M, $Q_3$, Max). To compare the three cities more closely, their 5-number summaries are:

[Table: columns City, Min, $Q_1$, M, $Q_3$, Max. Rows: ? (mystery city), Missoula, San Francisco. The values are not recoverable.]

The mystery city does appear colder than Missoula throughout the year.
Summary of resistant and sensitive measures:

- The mean is sensitive to the effects of outliers, whereas the median is resistant to the effects of outliers.
- The standard deviation is sensitive to the effects of outliers, whereas the IQR is resistant to the effects of outliers. The IQR is resistant because it is the difference between $Q_3$ and $Q_1$, and unusually large or small observations have little effect on $Q_3$ and $Q_1$. In contrast, all data values (including outliers) are used in the calculation of the standard deviation.

Summarizing a distribution with a measure of center and spread

Since the mean and standard deviation are not resistant, they are not appropriate for skewed distributions or distributions with outliers. They are most appropriate for symmetric distributions with no outliers.

Situation                                  Measures to use
Symmetric distribution with no outliers    Mean and SD, or median and IQR
Skewed distributions                       Median and IQR
Symmetric distributions with outliers      Median and IQR
Kathryn Robinson. Grades 3-5. From the Just Turn & Share Centers Series VOLUME 12
1 2 From the Just Turn & Share Centers Series VOLUME 12 Temperature TM From the Just Turn & Share Centers Series Kathryn Robinson 3 4 M Enterprises WriteMath Enterprises 2303 Marseille Ct. Suite 104 Valrico,
More informationCHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.
(c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals
More informationare the objects described by a set of data. They may be people, animals or things.
( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms
More informationSTP 420 INTRODUCTION TO APPLIED STATISTICS NOTES
INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make
More informationChapter 4. Displaying and Summarizing. Quantitative Data
STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range
More informationMath 140 Introductory Statistics
Math 140 Introductory Statistics Professor Silvia Fernández Chapter 2 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Visualizing Distributions Recall the definition: The
More informationMath 140 Introductory Statistics
Visualizing Distributions Math 140 Introductory Statistics Professor Silvia Fernández Chapter Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Recall the definition: The
More informationQUANTITATIVE DATA. UNIVARIATE DATA data for one variable
QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE
More informationHI SUMMER WORK
HI-201 2018-2019 SUMMER WORK This packet belongs to: Dear Dual Enrollment Student, May 7 th, 2018 Dual Enrollment United States History is a challenging adventure. Though the year holds countless hours
More informationElementary Statistics
Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:
More informationSolving Quadratic Equations by Graphing 6.1. ft /sec. The height of the arrow h(t) in terms
Quadratic Function f ( x) ax bx c Solving Quadratic Equations by Graphing 6.1 Write each in quadratic form. Example 1 f ( x) 3( x + ) Example Graph f ( x) x + 6 x + 8 Example 3 An arrow is shot upward
More informationLab Activity: Weather Variables
Name: Date: Period: Weather The Physical Setting: Earth Science Lab Activity: Weather Variables INTRODUCTION: A meteorologist is an individual with specialized education who uses scientific principles
More informationDescribing distributions with numbers
Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central
More informationSTAT 200 Chapter 1 Looking at Data - Distributions
STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the
More informationLecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data:
Lecture 2 Quantitative variables There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Stemplot (stem-and-leaf plot) Histogram Dot plot Stemplots
More informationNAWIC. National Association of Women in Construction. Membership Report. August 2009
NAWIC National Association of Women in Construction Membership Report August 2009 Core Purpose: To enhance the success of women in the construction industry Region 1 67 Gr Washington, DC 9 16 2 3 1 0 0
More informationUnits. Exploratory Data Analysis. Variables. Student Data
Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as
More informationResearch Update: Race and Male Joblessness in Milwaukee: 2008
Research Update: Race and Male Joblessness in Milwaukee: 2008 by: Marc V. Levine University of Wisconsin Milwaukee Center for Economic Development Briefing Paper September 2009 Overview Over the past decade,
More informationLecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)
Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning
More information1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.
1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions
More informationChapter 2: Tools for Exploring Univariate Data
Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is
More informationCHAPTER 1. Introduction
CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing
More information1. Evaluation of maximum daily temperature
1. Evaluation of maximum daily temperature The cumulative distribution of maximum daily temperature is shown in Figure S1. Overall, among all of the 23 states, the cumulative distributions of daily maximum
More informationHistograms allow a visual interpretation
Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called
More informationDescribing distributions with numbers
Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central
More informationLecture 1: Descriptive Statistics
Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics
More informationFurther Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data
Chapter 2: Summarising numerical data Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Extract from Study Design Key knowledge Types of data: categorical (nominal and ordinal)
More informationJUPITER MILLER BUSINESS CENTER 746,400 SF FOR LEASE
746,400 SF FOR LEASE Three LEED Certified Cross-Dock Buildings 54,600 Square Feet to 746,400 Square Feet Available Dallas City of Tax Incentives Available 36 Clear Height (Over 25% More Pallet Positions
More informationSection 2.3: One Quantitative Variable: Measures of Spread
Section 2.3: One Quantitative Variable: Measures of Spread Objectives: 1) Measures of spread, variability a. Range b. Standard deviation i. Formula ii. Notation for samples and population 2) The 95% rule
More informationScaling in Biology. How do properties of living systems change as their size is varied?
Scaling in Biology How do properties of living systems change as their size is varied? Example: How does basal metabolic rate (heat radiation) vary as a function of an animal s body mass? Mouse Hamster
More informationMath 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency
Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:
More informationA Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R 2nd Edition Brian S. Everitt and Torsten Hothorn CHAPTER 10 Scatterplot NA Air Pollution in the USA, and Risk Factors for Kyphosis 10.1 Introduction 10.2 Scatterplot
More informationLecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)
Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.
More informationChapter 4.notebook. August 30, 2017
Sep 1 7:53 AM Sep 1 8:21 AM Sep 1 8:21 AM 1 Sep 1 8:23 AM Sep 1 8:23 AM Sep 1 8:23 AM SOCS When describing a distribution, make sure to always tell about three things: shape, outliers, center, and spread
More informationF78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives
F78SC2 Notes 2 RJRC Algebra It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and just substitute in the numbers at the end. Example: 100 invested
More informationIntroduction to Statistics
Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,
More informationChapter 5. Understanding and Comparing. Distributions
STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume
More informationDescribing Distributions with Numbers
Describing Distributions with Numbers Using graphs, we could determine the center, spread, and shape of the distribution of a quantitative variable. We can also use numbers (called summary statistics)
More information, District of Columbia
State Capitals These are the State Seals of each state. Fill in the blank with the name of each states capital city. (Hint: You may find it helpful to do the word search first to refresh your memory.),
More informationNorth American Geography. Lesson 5: Barnstorm Like a Tennis Player!
North American Geography Lesson 5: Barnstorm Like a Tennis Player! Unit Overview: As students work through the activities in this unit they will be introduced to the United States in general, different
More informationChapter 6 Group Activity - SOLUTIONS
Chapter 6 Group Activity - SOLUTIONS Group Activity Summarizing a Distribution 1. The following data are the number of credit hours taken by Math 105 students during a summer term. You will be analyzing
More informationDescribing Distributions With Numbers
Describing Distributions With Numbers October 24, 2012 What Do We Usually Summarize? Measures of Center. Percentiles. Measures of Spread. A Summary Statement. Choosing Numerical Summaries. 1.0 What Do
More informationChapter 1. Looking at Data
Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,
More informationMath 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore
Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore Chapter 3 continued Describing distributions with numbers Measuring spread of data: Quartiles Definition 1: The interquartile
More informationExercises 36 CHAPTER 2/ORGANIZATION AND DESCRIPTION OF DATA
36 CHAPTER 2/ORGANIZATION AND DESCRIPTION OF DATA In the stem-and-leaf display, the column of first digits to the left of the vertical line is viewed as the stem, and the second digits as the leaves. Viewed
More informationSection 3.2 Measures of Central Tendency
Section 3.2 Measures of Central Tendency 1 of 149 Section 3.2 Objectives Determine the mean, median, and mode of a population and of a sample Determine the weighted mean of a data set and the mean of a
More informationMATH 1150 Chapter 2 Notation and Terminology
MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the
More informationVibrancy and Property Performance of Major U.S. Employment Centers. Appendix A
Appendix A DOWNTOWN VIBRANCY SCORES Atlanta 103.3 Minneapolis 152.8 Austin 112.3 Nashville 83.5 Baltimore 151.3 New Orleans 124.3 Birmingham 59.3 New York Midtown 448.6 Charlotte 94.1 Oakland 157.7 Chicago
More informationMeasures of center. The mean The mean of a distribution is the arithmetic average of the observations:
Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number