What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

Size: px
Start display at page:

Download "What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty."

Transcription

1 What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection, 1) organization, summarization and providing an overview of the general features of data, 2) and the drawing of inferences about a body of data (population) based on the properties of a part of the data (sample) observed. Population: the collection of all individuals of interest in a statistical study. Sample: any subset of individuals from the population. Branches of statistics Data 1) Descriptive statistics 2) Inferential statistics Data: the facts and figures that are collected, and analyzed in a statistical study. Individuals: are objects described by a set of data. [the entities on which data are collected] Variable: a characteristic of interest for the individual which takes on different values in different individual. Quantitative Variables (can be measured) [height, number of subscriptions,...] Qualitative (Categorical) Variables [hair color, gender,...] Discrete: a variable that has a countable number of possible values.(counts) Continuous: a variable that has an uncountable number of possible values.(measurements) Measurement Scales Nominal: consists of labels, names or categories. Ordinal: data that the order or rank is meaningful. Interval: numerical data that arithmetic operations are meaningful. Ratio: data that the ratio of two data is meaningful. Univariate data set consists of observations on a single variable. Multivariate data set consists of observations on more than one variable. Bivariate data set is a special multivariate case that consists of two variables.

2 Producing Data Type of Studies Observational Study: conditions to which subjects are exposed are not controlled by the investigator. (no attempt is made to control or influence the variables of interest) Experimental Study: conditions to which subjects are exposed to are controlled by the investigator. (treatments are used in order to observe the response) Completely Randomized Design: a design of experiment in which all the experimental units are allocated at random among all the treatments. Data Collection Techniques: Personal interview, Telephone interview, Self-administrated questionnaire (by person, mail, internet,...), Direct observation,... Sources of Data: Routinely kept records Surveys Designed Experiments External sources (internet, public or private organizations)... Preparing Data: Retain the raw data source Study log: data received, investigation, statisticians, data source Machine readable data base (save original files) Data cleaning (logic check, correcting, clarifying) Sampling Techniques: (To obtain representative sample) Simple random sampling:a sample obtained from a population in such a manner that all samples of the same size have equal likelihood of being selected. Stratified random sampling: Classify the population into at least two strata, then draw a sample from each. Cluster sampling: Divide the population into sections, randomly select few of those sections, and then choose all members in them. Systematic sample: Select every i-th member in the population according to their ID. Convenient sample: Use results that are already available or using data that is convenient to obtain.... Bias Response bias (influenced response, lies, ) Nonresponse bias

3 Descriptive Statistics Grouping, Describing and Displaying Data. Numerical Summary. [What type of statistical analysis is appropriate?] Categorical variable? Quantitative variable? Data Sheet ID Height Weight BirthMonth Exp. Gender H F H F T M H F T F H M H F H M T M H M T M T F H M T F H M H F T F T M H M T M H M H M Grouping and Displaying Categorical Data Frequency Table and Charts Class Frequency Relative Frequency Female 9 9/22 =.409 = 40.9% Male 13 13/22 =.591 = 59.1% Total % Female 40.9% Male 59.1% Pe rcent 10 0 Female Male sex

4 Grouping and Displaying Quantitative Data Frequency Table and Histogram Frequency Distribution Table (From data sheet) Class Frequency Relative Cumulative R.F. Freq < /22 =.091 2/ < /22 =.091 4/ < /22 =.136 7/ < /22 =.091 9/ < /22 = / < /22 = / < /22 = / < /22 = / < /22 = / < /22 = /22 Total Classes: Categories for grouping data. Frequency(class frequency): The number of data values in a class. Relative frequency: The ratio of the frequency of a class to the total number of pieces of data. Frequency distribution: A listing of classes and their frequencies. Relative Frequency distribution: A listing of classes and their relative frequencies. Upper class limit: The largest value that can go in a class. Lower class limit: The smallest value that can go in a class. Class width: The difference between the lower class limit of the given class and the lower class limit of the next higher class. Class midpoint (class mark): The midpoint of a class. Guidelines for grouping (quantitative) data: There should be between five and twenty classes. Each piece of data must belong to one, and only one, class. Whenever feasible, all classes should have the same width. Histogram (SPSS) Std. D ev = Mean = N = WEIGHT

5 (Excel ouput) What to observe in Histograms? Outliers: observations that stand out from the rest for some reason. Center: the middle of the data. Spread: the range; the extent of the data; how far the values are from each other. Shape: distribution pattern. [Skewness, symmetry, uniform,...] Symmetric (Bell) shape Skewed to the right, or positively skewed Bimodal Uniform Skewed to the left, or negatively skewed

6 Stemplots (or Stem-and-leaf plots) -- leading digits are called stems -- final digits are called leaves Splitting stemplots Back-to-back stemplots (What if the data are fractions?) Example: (number of hysterectomies performed by 15 male doctors) 27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, For ten female doctors, the data are 5, 7, 10, 14, 18, 19, 25, 29, 31, Example: Number of hysterectomies performed by 15 male doctors: 27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28 for ten female doctors, the data are: 5, 7, 10, 14, 18, 19, 25, 29, 31, 33 (Male) (Female) Back-to-back stem-plot

7 Example: (Height data with gender) Female: 60, 63, 64, 65, 65, 65, 66, 67 Male: 61, 64, 69, 71, 71, 71, 72, 72, 72, 72, 73, 74, 75 (See data sheet) Female Male Back-to-back Female Male Split-back-to-back 430 6* 14 * => # 9 # => 5-9 7* # 5 Summarizing categorical data: Frequency distribution, relative frequency distribution,... Bar chart, Pie chart, Pareto chart,... Summarizing quantitative data: Frequency distribution, relative frequency distribution,... Histogram, Stemplot (Stem-and-leaf plot), box plot, dot plot,... When examining the data through graphs we should be on the alert for unusual values that do not follow the pattern of the rest of the data, some sense of a central or typical value of the data, some sense of how spread out or variable the data are, some sense of the shape of the overall pattern, in timeplots, be on the lookout for trends over time. Describing distribution with mathematical functions Density function, p(x) (Discrete distribution) x (Continuous distribution) Density, f (x) x

8 Examine Bivariate Data Categorical variables BW * SMOKE Crosstabulation SMOKE BW Normal Count Non-Smoke Smoke Total % of Total 79.0% 16.1% 95.2% Low birth weight Count % of Total 3.6% 1.2% 4.8% Total Count % of Total 82.6% 17.4% 100.0% Percent SMOKE 20 Non-Smoke 0 Normal Low birth weight Smoke Birth Weight Quantitative variables HEIGHT sex 62 Male 60 Female Total Population WEIGHT

9 Describing Distributions with Numbers Measuring center Mean: the average value of the data. If n observations are denoted by x 1, x 2,..., x n, their (sample) mean is x = x + x x n = n 1 2 n 1 n i= 1 x i Example: Birth weights (in lb) of 5 babies born from a group of women under certain diet. 7, 6, 8, 7, 7 => mean = ( ) / 5 = 35/5 = 7 [near the center of the data set] Example: (number of hysterectomies performed by 15 male doctors) 27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28 => mean = For ten female doctors, the data are 5, 7, 10, 14, 18, 19, 25, 29, 31, 33 mean =? Median: of a data set is the data value exactly in the middle of its ordered list if the number of pieces of data is odd, the mean of the two middle data values in its ordered list if the number of pieces of data is even. [median is not influenced by outliers and is best for nonsymmetric distribution] Example: (number of hysterectomies performed by 15 doctors) 27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28 ordered list => 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86 median = 34 Example: (Birth weights for 6 infants.) 5, 7, 6, 8, 5, 9 ordered list => 5, 5, 6, 7, 8, 9 median = (6+7) / 2 = 6.5

10 Mode: of a data set is the observation that occurs most frequently. Example: (number of hysterectomies performed by 15 doctors) 27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28 ordered list => 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86 Mode = 25 What is the Modal class? Measuring spread (variability) Range = largest data value smallest data value Sample from group I(diet program I): 7, 6, 8, 7, 7 => mean = ( ) / 5 = 35/5 = 7 Sample from group II(diet program II): 3, 4, 8, 9, 11=> mean = ( ) / 5 = 35/5 = 7 range of I = 8-6 = 2 range of II = 11-3 = 8 Is there any difference between the two samples? Quartiles: The first quartile, Q 1, or 25 th percentile, is the median of the lower half of the list of ordered observations. The third quartile, Q 3, or 75 th percentile, is the median of the upper half of the list of ordered observations. [percentiles?] Interquartile range (IQR) = Q 3 Q 1 Example: (data sheet) [odd number of data] 60,61,63,64,64,65,65,65,66,67,69,71,71,71,72,72,72,72,73,74,75 IQR = = 7.5 Q 1 = 64.5 Median = 69 Q 3 = 72 [even number of data] 6, 60,61,63,64,64,65,65,65,66,67, 69,71,71,71,72,72,72,72,73,74,75 IQR =? Q 1 =? Median =? Q 3 =?

11 Variance and Standard Deviation If n observations are denoted by x 1, x 2,..., x n, their variance and standard deviation are Sample Variance: s 2 = n i= 1 ( x i x) n 1 2, Sample Standard Deviation: s = n i= 1 ( x i x) n 1 2 Example: Birth weights (in lb) of 5 babies born from a group of women under diet program II. Weight data, x 3, 4, 8, 9, 11 Deviation from mean Absolute Deviation from mean = = = = = = = = = = 4 16 Total Square of deviation Sample Variance = 46/4 = 11.5 lb, Sample Standard Deviation = 46 / 4 = 3.39 lb. What is the standard deviation of the weights of babies from the sample of mothers who received diet program I? Does the mother s diet program affect the birth weights of babies? Coefficient of Variation (C.V.): is the standard deviation expressed as a percentage of the mean. It is a unit-free measure of dispersion. It provides a measurement for comparing relative variability of data sets from different scales. s 100 x C.V. = % Example: As part of the Berkeley Guidance Study. The heights (in cm) and weights (in kg) of 13 girls were measured at age two. The average height was 86.6 cm with a s.d.= 2.9 and average weight is 12.6 kg with a s.d.= 1.4. C.V. (height) = (2.9/86.6)100% = 3.3% C.V. (weight) = (1.4/12.6)100% = 11.1% More variability is weight than in height. Which of the following data has relatively higher variability? Systolic blood pressures: 113, 124, 125, 133, 146, 151, 169 Body temperature: 97.2, 97.5, 97.6, 97.9, 98.1, 98.2, 98.5

12 The five-number summary 1. Minimum value 2. Q 1 3. Median 4. Q 3 5. Maximum value Example: (data sheet without outlier 6 ) 80 The five-number summary and boxplot Min = 60, Q 1 = 64.5, Median = 69, Q 3 = 72, Max = 75. Boxplot (Box-and-Whisker Plot) N = 21 HEIGHT Interquartile range (IQR) = Q 3 Q 1 Inner and outer fences The inner fences are located at a distance of 1.5 IQR below Q 1 and at a distance of 1.5 IQR above Q 3. The outer fences are located at a distance of 3 IQR below Q 1 and at a distance of 3 IQR above Q 3. Mild and Extreme outliers Data values falling between the inner and outer fences are considered mild outliers. Data values falling outside the outer fences are considered extreme outliers. When outliers exist, the whisker extended to the smallest and largest data values within the inner fence.

13 N = 1 22 HEIGHT Side-by-side boxplot HEIGHT 50 N = 8 Female 13 Male sex

14 About s (sample standard deviation) : s measures the spread around the mean. the larger s is, the more spread out the data are. if s = 0, then all the observations must be equal. s is strongly influenced by outliers. Remarks: If the distribution of the data is symmetric, then the mean and median will be the same. The five-number summary is best for nonsymmetric data. The median and quartiles are not influenced by outliers. The mean and standard deviation are most appropriate to use only if the data are symmetric because both of these measures are easily influenced by outliers. Properties of a symmetric and bell-shaped (Normal) distribution: the distribution is symmetric about it mean (µ), 68% of the area is between µ σ and µ + σ, 95% of the area is between µ 2σ and µ + 2σ, 99.7% of the area is between µ 3σ and µ + 3σ. If x is an observation from a distribution that has mean µ, and standard deviation σ, the standardized value of x is, z-score of x : z = x µ σ µ + 3σ has a z-score 3, since it is 3 s.d. from mean. If a distribution has a mean 10 and a s.d. 2 and the value 7 has a z-score 1.5 = (7 10)/2. Chebychev s inequality There is at least 1 (1/k 2 ) of the data in a data set lie within k standard deviation of their mean. Example: Heart rates for asthmatic patients in a state of respiratory arrest has a mean of 140 beats per minute and a standard deviation of 35.5 beats per minute. With Chebychev s inequality, we know there are at least 75% of the patients that had heart rates between two standard deviations of the mean, (i.e., 140-2x35.5 = 69 and 140+2x35.5 = 211) in a state of respiratory arrest. Because, 1 (1/2 2 ) = ¾ = 75%.

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,

More information

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

Data Description. Measure of Central Tendency. Data Description. Chapter x i

Data Description. Measure of Central Tendency. Data Description. Chapter x i Data Descriptio Describe Distributio with Numbers Example: Birth weights (i lb) of 5 babies bor from two groups of wome uder differet care programs. Group : 7, 6, 8, 7, 7 Group : 3, 4, 8, 9, Chapter 3

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

Tastitsticsss? What s that? Principles of Biostatistics and Informatics. Variables, outcomes. Tastitsticsss? What s that?

Tastitsticsss? What s that? Principles of Biostatistics and Informatics. Variables, outcomes. Tastitsticsss? What s that? Tastitsticsss? What s that? Statistics describes random mass phanomenons. Principles of Biostatistics and Informatics nd Lecture: Descriptive Statistics 3 th September Dániel VERES Data Collecting (Sampling)

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

BNG 495 Capstone Design. Descriptive Statistics

BNG 495 Capstone Design. Descriptive Statistics BNG 495 Capstone Design Descriptive Statistics Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential statistical methods, with a focus

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 1- part 1: Describing variation, and graphical presentation Outline Sources of variation Types of variables Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Chapter 2: Summarising numerical data Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Extract from Study Design Key knowledge Types of data: categorical (nominal and ordinal)

More information

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

8/4/2009. Describing Data with Graphs

8/4/2009. Describing Data with Graphs Describing Data with Graphs 1 A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration. Examples: Hair color, white blood cell count,

More information

Exploring, summarizing and presenting data. Berghold, IMI, MUG

Exploring, summarizing and presenting data. Berghold, IMI, MUG Exploring, summarizing and presenting data Example Patient Nr Gender Age Weight Height PAVK-Grade W alking Distance Physical Functioning Scale Total Cholesterol Triglycerides 01 m 65 90 185 II b 200 70

More information

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Section 3. Measures of Variation

Section 3. Measures of Variation Section 3 Measures of Variation Range Range = (maximum value) (minimum value) It is very sensitive to extreme values; therefore not as useful as other measures of variation. Sample Standard Deviation The

More information

UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 540W - Introduction to Biostatistics Fall 2004

UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 540W - Introduction to Biostatistics Fall 2004 UNIVERSITY OF MASSACHUSETTS Department of Biostatistics and Epidemiology BioEpi 50W - Introduction to Biostatistics Fall 00 Exercises with Solutions Topic Summarizing Data Due: Monday September 7, 00 READINGS.

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics CHAPTER OUTLINE 6-1 Numerical Summaries of Data 6- Stem-and-Leaf Diagrams 6-3 Frequency Distributions and Histograms 6-4 Box Plots 6-5 Time Sequence Plots 6-6 Probability Plots Chapter

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is

More information

University of Jordan Fall 2009/2010 Department of Mathematics

University of Jordan Fall 2009/2010 Department of Mathematics handouts Part 1 (Chapter 1 - Chapter 5) University of Jordan Fall 009/010 Department of Mathematics Chapter 1 Introduction to Introduction; Some Basic Concepts Statistics is a science related to making

More information

Description of Samples and Populations

Description of Samples and Populations Description of Samples and Populations Random Variables Data are generated by some underlying random process or phenomenon. Any datum (data point) represents the outcome of a random variable. We represent

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms Interpreting Histograms Displaying Distributions: Stemplots Describing Center: Mean and Median Describing Variability: The Quartiles The

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

MgtOp 215 Chapter 3 Dr. Ahn

MgtOp 215 Chapter 3 Dr. Ahn MgtOp 215 Chapter 3 Dr. Ahn Measures of central tendency (center, location): measures the middle point of a distribution or data; these include mean and median. Measures of dispersion (variability, spread):

More information

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data:

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Lecture 2 Quantitative variables There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Stemplot (stem-and-leaf plot) Histogram Dot plot Stemplots

More information

Unit 2: Numerical Descriptive Measures

Unit 2: Numerical Descriptive Measures Unit 2: Numerical Descriptive Measures Summation Notation Measures of Central Tendency Measures of Dispersion Chebyshev's Rule Empirical Rule Measures of Relative Standing Box Plots z scores Jan 28 10:48

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Section 1.1. Data - Collections of observations (such as measurements, genders, survey responses, etc.)

Section 1.1. Data - Collections of observations (such as measurements, genders, survey responses, etc.) Section 1.1 Statistics - The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific number

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Sociology 6Z03 Review I

Sociology 6Z03 Review I Sociology 6Z03 Review I John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review I Fall 2016 1 / 19 Outline: Review I Introduction Displaying Distributions Describing

More information

The science of learning from data.

The science of learning from data. STATISTICS (PART 1) The science of learning from data. Numerical facts Collection of methods for planning experiments, obtaining data and organizing, analyzing, interpreting and drawing the conclusions

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Fall 213 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific

More information

Descriptive Statistics Example

Descriptive Statistics Example Descriptive tatistics Example A manufacturer is investigating the operating life of laptop computer batteries. The following data are available. Life (min.) Life (min.) Life (min.) Life (min.) 130 145

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

After completing this chapter, you should be able to:

After completing this chapter, you should be able to: Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard

More information

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things. (c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Descriptive Statistics Methods of organizing and summarizing any data/information.

Descriptive Statistics Methods of organizing and summarizing any data/information. Introductory Statistics, 10 th ed. by Neil A. Weiss Chapter 1 The Nature of Statistics 1.1 Statistics Basics There are lies, damn lies, and statistics - Mark Twain Descriptive Statistics Methods of organizing

More information

Clinical Research Module: Biostatistics

Clinical Research Module: Biostatistics Clinical Research Module: Biostatistics Lecture 1 Alberto Nettel-Aguirre, PhD, PStat These lecture notes based on others developed by Drs. Peter Faris, Sarah Rose Luz Palacios-Derflingher and myself Who

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Week 1 Chapter 1 Introduction What is Statistics? Why do you need to know Statistics? Technical lingo and concepts:

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

2 Descriptive Statistics

2 Descriptive Statistics 2 Descriptive Statistics Reading: SW Chapter 2, Sections 1-6 A natural first step towards answering a research question is for the experimenter to design a study or experiment to collect data from the

More information

Lecture 1 : Basic Statistical Measures

Lecture 1 : Basic Statistical Measures Lecture 1 : Basic Statistical Measures Jonathan Marchini October 11, 2004 In this lecture we will learn about different types of data encountered in practice different ways of plotting data to explore

More information

Summarising numerical data

Summarising numerical data 2 Core: Data analysis Chapter 2 Summarising numerical data 42 Core Chapter 2 Summarising numerical data 2A Dot plots and stem plots Even when we have constructed a frequency table, or a histogram to display

More information

UCLA STAT 10 Statistical Reasoning - Midterm Review Solutions Observational Studies, Designed Experiments & Surveys

UCLA STAT 10 Statistical Reasoning - Midterm Review Solutions Observational Studies, Designed Experiments & Surveys UCLA STAT 10 Statistical Reasoning - Midterm Review Solutions Observational Studies, Designed Experiments & Surveys.. 1. (i) The treatment being compared is: (ii). (5) 3. (3) 4. (4) Study 1: the number

More information

REVIEW: Midterm Exam. Spring 2012

REVIEW: Midterm Exam. Spring 2012 REVIEW: Midterm Exam Spring 2012 Introduction Important Definitions: - Data - Statistics - A Population - A census - A sample Types of Data Parameter (Describing a characteristic of the Population) Statistic

More information

FREQUENCY DISTRIBUTIONS AND PERCENTILES

FREQUENCY DISTRIBUTIONS AND PERCENTILES FREQUENCY DISTRIBUTIONS AND PERCENTILES New Statistical Notation Frequency (f): the number of times a score occurs N: sample size Simple Frequency Distributions Raw Scores The scores that we have directly

More information

BIOS 2041: Introduction to Statistical Methods

BIOS 2041: Introduction to Statistical Methods BIOS 2041: Introduction to Statistical Methods Abdus S Wahed* *Some of the materials in this chapter has been adapted from Dr. John Wilson s lecture notes for the same course. Chapter 0 2 Chapter 1 Introduction

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

Introduction to Statistics

Introduction to Statistics Why Statistics? Introduction to Statistics To develop an appreciation for variability and how it effects products and processes. Study methods that can be used to help solve problems, build knowledge and

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Chapter 2 Solutions Page 15 of 28

Chapter 2 Solutions Page 15 of 28 Chapter Solutions Page 15 of 8.50 a. The median is 55. The mean is about 105. b. The median is a more representative average" than the median here. Notice in the stem-and-leaf plot on p.3 of the text that

More information

Frequency Distribution Cross-Tabulation

Frequency Distribution Cross-Tabulation Frequency Distribution Cross-Tabulation 1) Overview 2) Frequency Distribution 3) Statistics Associated with Frequency Distribution i. Measures of Location ii. Measures of Variability iii. Measures of Shape

More information

Full file at

Full file at IV SOLUTIONS TO EXERCISES Note: Exercises whose answers are given in the back of the textbook are denoted by the symbol. CHAPTER Description of Samples and Populations Note: Exercises whose answers are

More information

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability 3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability 3.1 Week 1 Review Creativity is more than just being different. Anybody can plan weird; that s easy. What s hard is to be

More information

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) Following is an outline of the major topics covered by the AP Statistics Examination. The ordering here is intended to define the

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010

Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010 Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010 Review Recording observations - Must extract that which is to be analyzed: coding systems,

More information

Measures of Central Tendency and their dispersion and applications. Acknowledgement: Dr Muslima Ejaz

Measures of Central Tendency and their dispersion and applications. Acknowledgement: Dr Muslima Ejaz Measures of Central Tendency and their dispersion and applications Acknowledgement: Dr Muslima Ejaz LEARNING OBJECTIVES: Compute and distinguish between the uses of measures of central tendency: mean,

More information