Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Similar documents
Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Chapter 1. Looking at Data

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

Chapter 4. Displaying and Summarizing. Quantitative Data

Elementary Statistics

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

STAT 200 Chapter 1 Looking at Data - Distributions

Unit 2. Describing Data: Numerical

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Histograms allow a visual interpretation

Chapter 1: Exploring Data

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

Chapter 2: Tools for Exploring Univariate Data

are the objects described by a set of data. They may be people, animals or things.

P8130: Biostatistical Methods I

CHAPTER 2: Describing Distributions with Numbers

Describing Distributions

Describing Distributions With Numbers Chapter 12

Resistant Measure - A statistic that is not affected very much by extreme observations.

Descriptive Univariate Statistics and Bivariate Correlation

Describing Distributions With Numbers

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

AP Final Review II Exploring Data (20% 30%)

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

Chapter 5: Exploring Data: Distributions Lesson Plan

Units. Exploratory Data Analysis. Variables. Student Data

Section 3. Measures of Variation

Describing distributions with numbers

Statistics for Managers using Microsoft Excel 6 th Edition

Chapter 3. Data Description

3.1 Measure of Center

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data:

Describing distributions with numbers

Lecture 2 and Lecture 3

Stat 20: Intro to Probability and Statistics

Chapter 4.notebook. August 30, 2017

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1. Exploratory Data Analysis

1.3.1 Measuring Center: The Mean

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Chapter 3. Measuring data

Describing Distributions with Numbers

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Lecture 8: Chapter 4, Section 4 Quantitative Variables (Normal)

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

BNG 495 Capstone Design. Descriptive Statistics

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

Stat 101 Exam 1 Important Formulas and Concepts 1

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS

Example 2. Given the data below, complete the chart:

2011 Pearson Education, Inc

MAT Mathematics in Today's World

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

Chapter 6 Group Activity - SOLUTIONS

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

MATH 117 Statistical Methods for Management I Chapter Three

CHAPTER 1 Exploring Data

TOPIC: Descriptive Statistics Single Variable

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Math 140 Introductory Statistics

Math 140 Introductory Statistics

Chapter 1 - Lecture 3 Measures of Location

Introduction to Statistics

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 3.1-1

The Normal Distribution. Chapter 6

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

Descriptive Statistics

Week 1: Intro to R and EDA

Stat Lecture Slides Exploring Numerical Data. Yibi Huang Department of Statistics University of Chicago

Lecture 10A: Chapter 8, Section 1 Sampling Distributions: Proportions

1 Measures of the Center of a Distribution

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Lecture 1: Descriptive Statistics

STOR 155 Introductory Statistics. Lecture 4: Displaying Distributions with Numbers (II)

Algebra Calculator Skills Inventory Solutions

1.3: Describing Quantitative Data with Numbers

Chapter 5. Understanding and Comparing. Distributions

Determining the Spread of a Distribution

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Determining the Spread of a Distribution

Chapter2 Description of samples and populations. 2.1 Introduction.

CIVL 7012/8012. Collection and Analysis of Information

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

Chapters 1 & 2 Exam Review

Sections 2.3 and 2.4

Lecture 2. Descriptive Statistics: Measures of Center

M 225 Test 1 B Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

After completing this chapter, you should be able to:

Section 2.3: One Quantitative Variable: Measures of Spread

Chapter 3 Statistics for Describing, Exploring, and Comparing Data. Section 3-1: Overview. 3-2 Measures of Center. Definition. Key Concept.

Section 3.2 Measures of Central Tendency

Unit 2: Numerical Descriptive Measures

Statistics I Chapter 2: Univariate data analysis

Description of Samples and Populations

2.1 Measures of Location (P.9-11)

Transcription:

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs. Median Cengage Learning Elementary Statistics: Looking at the Big Picture 1

Looking Back: Review 4 Stages of Statistics Data Production (discussed in Lectures 1A-2B) Displaying and Summarizing Single variables: 1 cat. (Lecture 3A), 1 quantitative Relationships between 2 variables Probability Statistical Inference Elementary Statistics: Looking at the Big Picture L6.2

Example: Issues to Consider Background: Intro stat students earnings (in $1000s) previous year: 12, 3, 7, 1, [survey was anonymous]. Questions: What population do the data represent? Were responses unbiased? Responses: All students at that university, if sample was representative in terms of Probably unbiased because Looking Back: These are data production issues. Practice: 4.17 p.92 Elementary Statistics: Looking at the Big Picture L6.3

Example: More Issues to Consider Background: Intro stat students earnings (in $1000s) previous year: 12, 3, 7, 1, [survey was anonymous]. Questions: How do we summarize the data? Sample average was $3776. Can we conclude population average was less than $5000? Responses: Mean and other summaries are the focus of this part. Looking Ahead: This is an inference question, to be addressed in Part Four. Practice: 4.29c p.104 Elementary Statistics: Looking at the Big Picture L6.4

Definitions Distribution: tells all possible values of a variable and how frequently they occur Summarize distribution of a quantitative variable by telling shape, center, spread. Shape: tells which values tend to be more or less common Center: measure of what is typical in the distribution of a quantitative variable Spread: measure of how much the distribution s values vary Elementary Statistics: Looking at the Big Picture L6.5

Definitions Symmetric distribution: balanced on either side of center Skewed distribution: unbalanced (lopsided) Skewed left: has a few relatively low values Skewed right: has a few relatively high values Outliers: values noticeably far from the rest Unimodal: single-peaked Bimodal: two-peaked Uniform: all values equally common (flat shape) Normal: a particular symmetric bell-shape Elementary Statistics: Looking at the Big Picture L6.6

Displays of a Quantitative Variable Displays help see the shape of the distribution. Stemplot Advantage: most detail Disadvantage: impractical for large data sets Histogram Advantage: works well for any size data set Disadvantage: some detail lost Boxplot Advantage: shows outliers, makes comparisons Cè Q Disadvantage: much detail lost Elementary Statistics: Looking at the Big Picture L6.7

Definition Stemplot: vertical list of stems, each followed by horizontal list of one-digit leaves stems 1-digit leaves. Elementary Statistics: Looking at the Big Picture L6.8

Example: Constructing a Stemplot Background: Masses (in 1000 kg) of 20 dinosaurs: 0.0 0.0 0.1 0.2 0.4 0.6 0.7 0.7 1.0 1.1 1.1 1.2 1.5 1.7 1.7 1.8 2.9 3.2 5.0 5.6 Response: Do not skip the 4 stem: why? Long tailà -skewed. 1 peakà Most below 2000 kg, a few unusually heavy. Practice: 4.38a p.107 Elementary Statistics: Looking at the Big Picture L6.9

Modifications to Stemplots Too few stems? Split Split in 2: 1 st stem gets leaves 0-4, 2 nd gets 5-9 Split in 5: 1 st stem gets leaves 0-1, 2 nd gets 2-3, etc. Split in 10: 1 st gets 0,, 10 th gets 9. Too many stems? Truncate last digit(s). Elementary Statistics: Looking at the Big Picture L6.10

Example: Splitting Stems Background: Credits taken by 14 other students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18 Questions: What shape do we guess for non-traditional (other) students? How to construct stemplot to make shape clear? Responses: Expect shape -skewed due to Stemplot: 1st attempt has too few stems 0 4 7 1 1 1 1 3 3 4 4 5 7 7 7 8 so split 2 ways: Elementary Statistics: Looking at the Big Picture L6.11

Example: Truncating Digits Background: Minutes spent on computer day before 0 10 20 30 30 30 30 45 45 60 60 60 67 90 100 120 200 240 300 420 Question: How to construct stemplot to make shape clear? Response: Stems 0 to 42 too many: truncate last digit, work with 100 s (stems) and 10 s (leaves): Skewed : most times less than 100 minutes, but a few had unusually long times. Practice: 4.25b p.92 Elementary Statistics: Looking at the Big Picture L6.12

Definition Histogram: to display quantitative values 1. Divide range of data into intervals of equal width. 2. Find count or percent or proportion in each. 3. Use horizontal axis for range of data values, vertical axis for count/percent/proportion in each. Elementary Statistics: Looking at the Big Picture L6.13

Example: Constructing a Histogram Background: Prices of 12 used upright pianos: 100 450 500 650 695 1100 1200 1200 1600 2100 2200 2300 Question: Construct a histogram for the data; what does it tell us about the shape? Response: We opted to put 500 as left endpoint of interval; be consistent (a price of 1000 would go in Practice: 4.27a p.103 Elementary Statistics: Looking at the Big Picture L6.14

Definitions (Review) Shape: tells which values tend to be more or less common Center: measure of what is typical in the distribution of a quantitative variable Spread: measure of how much the distribution s values vary Elementary Statistics: Looking at the Big Picture L6.15

Definitions Median: a measure of center: the middle for odd number of values average of middle two for even number of values Quartiles: measures of spread: 1 st Quartile (Q1) has one-fourth of data values at or below it (middle of smaller half) 3 rd Quartile (Q3) has three-fourths of data values at or below it (middle of larger half) (By hand, for odd number of values, omit median to find quartiles.) Elementary Statistics: Looking at the Big Picture L6.16

Definitions Percentile: value at or below which a given percentage of a distribution s values fall A Closer Look: Q1 is 25 th percentile, Q3 is 75 th percentile. Range: difference between maximum and minimum values Interquartile range: tells spread of middle half of data values, written IQR=Q3-Q1 Elementary Statistics: Looking at the Big Picture L6.17

Ways to Measure Center and Spread Five Number Summary: 1. Minimum 2. Q1 3. Median 4. Q3 5. Maximum Mean and Standard Deviation (more useful but less straightforward to find) Elementary Statistics: Looking at the Big Picture L6.18

Example: Finding 5 Number Summary and IQR Background: Credits taken by 14 non-traditional students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18 Question: What are Five Number Summary, range, and IQR? Response: 1. Minimum: 2. Q1: 3. Median: 4. Q3: 5. Maximum: Range: IQR: Practice: 4.28b p.103 Elementary Statistics: Looking at the Big Picture L6.19

Definition The 1.5-Times-IQR Rule identifies outliers: below Q1-1.5(IQR) considered low outlier above Q3+1.5(IQR) considered high outlier Elementary Statistics: Looking at the Big Picture L6.20

Definition A boxplot displays median, quartiles, and extreme values, with special treatment for outliers: 1. Bottom whisker to minimum non-outlier 2. Bottom of box at Q1 3. Line through box at median 4. Top of box at Q3 5. Top whisker to maximum non-outlier Outliers denoted *. Elementary Statistics: Looking at the Big Picture L6.21

Example: Identifying Outliers Background: Credits taken by 14 non-traditional students had 5 No. Summary: 4, 11, 13.5, 17, 18 Questions: Are there outliers? Responses: Q1=, Q3= IQR= 1.5 IQR= Q1-1.5(IQR)= : Low outliers?. Q3+1.5(IQR)= : High outliers?. Practice: 4.28c p.103 Elementary Statistics: Looking at the Big Picture L6.22

Example: Constructing Boxplot Background: Credits taken by 14 non-traditional students had 5 No. Summary: 4, 11, 13.5, 17, 18 Question: How is the boxplot constructed? Response: Maximum=18à Q3=17à Median=13.5à Q1=11à Typical credits about 13.5, middle half between 11 and 17, shape is left-skewed Minimum 4à Practice: 4.28c p.108 Elementary Statistics: Looking at the Big Picture L6.23

Ways to Measure Center and Spread Five Number Summary (already discussed) Mean and Standard Deviation Elementary Statistics: Looking at the Big Picture L7.24

Definition Mean: the arithmetic average of values. For n sampled values, the mean is called x-bar : The mean of a population, to be discussed later, is denoted and called mu. Elementary Statistics: Looking at the Big Picture L7.25

Example: Calculating the Mean Background: Credits taken by 14 other students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18 Question: How do we find the mean number of credits? Response: Practice: 4.34a p.105 Elementary Statistics: Looking at the Big Picture L7.26

Example: Mean vs. Median (Skewed Left) Background: Credits taken by 14 other students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18 Question: Why is the mean (13) less than the median (13.5)? Response: Practice: 4.26d-e p.103 Elementary Statistics: Looking at the Big Picture L7.27

Example: Mean vs. Median (Skewed Right) Background: Output for students computer times: 0 10 20 30 30 30 30 45 45 60 60 60 67 90 100 120 200 240 300 420 Question: Why is the mean (97.9) more than the median (60)? Response: Practice: 4.30b p.104 Elementary Statistics: Looking at the Big Picture L7.28

Role of Shape in Mean vs. Median Symmetric: mean approximately equals median Skewed left / low outliers: mean less than median Skewed right / high outliers: mean greater than median Elementary Statistics: Looking at the Big Picture L7.29

Mean vs. Median as Summary of Center Pronounced skewness / outliers Report median. Otherwise, in general Report mean (contains more information). Elementary Statistics: Looking at the Big Picture L7.30

Lecture Summary (Quantitative Displays, Begin Summaries) Display: stemplot, histogram Shape: Symmetric or skewed? Unimodal? Normal? Center and Spread median and range, IQR mean identify outliers display with boxplot Elementary Statistics: Looking at the Big Picture L6.31