CHAPTER 2 Numerical Measures

Graphical methods may not always be sufficient for describing data. You can use the data to calculate a set of numbers that will convey a good mental picture of the frequency distribution. Numerical descriptive measures associated with a population of measurements are called parameters; those computed from sample measurements are called statistics.

Measures of Location

Mean
This is the usual arithmetic mean or average and is equal to the sum of the measurements divided by the number of measurements.

Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$

Median
This is the middle of the measurements when they are ordered. The position of the median is $0.5(n + 1)$.

Mode
The mode is the measurement which occurs most frequently.

Note: The mean and median are equal when the distribution of the data is symmetric; the mean is greater than the median when the distribution is skewed to the right and less than the median when the distribution is skewed to the left.

Example: The prices for 14 different brands of water-packed light tuna are
0.99, 1.92, 1.23, 0.85, 0.65, 0.53, 1.41, 1.12, 0.63, 0.67, 0.69, 0.60, 0.60, 0.66
a. Find the average price for the 14 different brands of tuna.
b. Find the median price for the 14 different brands of tuna.
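As an illustrative sketch (not part of the original notes), parts a and b of the tuna example can be checked with Python's standard `statistics` module. The variable names are assumptions chosen for this illustration.

```python
import statistics

# Prices of the 14 brands of water-packed light tuna from the example.
prices = [0.99, 1.92, 1.23, 0.85, 0.65, 0.53, 1.41,
          1.12, 0.63, 0.67, 0.69, 0.60, 0.60, 0.66]

n = len(prices)

# Mean: sum of the measurements divided by the number of measurements.
sample_mean = sum(prices) / n

# Median: position 0.5(n + 1) in the ordered data. With n = 14 the position
# is 7.5, so the median is the average of the 7th and 8th ordered values.
median_position = 0.5 * (n + 1)
sample_median = statistics.median(prices)

# Mode: the measurement that occurs most frequently.
sample_mode = statistics.mode(prices)

print(f"n               = {n}")
print(f"mean            = {sample_mean:.4f}")
print(f"median position = {median_position}")
print(f"median          = {sample_median:.2f}")
print(f"mode            = {sample_mode:.2f}")
```

Comparing the printed mean with the printed median is a natural starting point for the question about skewness that follows.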

c. Based on your findings in parts a and b, do you think that the distribution of prices is skewed?

Measures of Variability

Data sets may have the same center but look different because of the way the numbers spread out from the center. Measures of variability can help you create a mental picture of the spread of the data.

Range
R = largest measurement - smallest measurement

Variance
It measures the average squared deviation of the measurements about their mean.

Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}}{N}$

Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}{n - 1}$

Note: $\sum X_i^2$ is the sum of the squares of the measurements, and $\left(\sum X_i\right)^2$ is the square of the sum of the measurements.

Standard deviation

Population standard deviation: $\sigma = \sqrt{\sigma^2}$

Sample standard deviation: $s = \sqrt{s^2}$

Example: You are given n = 8 measurements: 3, 1, 5, 6, 4, 4, 3, 5.
a. Calculate the range.
b. Calculate the sample mean.
c. Calculate the sample variance and standard deviation.
d. Compare the range and the standard deviation. The range is approximately how many standard deviations?
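The following is a minimal sketch, added for illustration, that works through the eight-measurement example with both forms of the sample variance formula (the definitional form and the computing form based on the sum of squares). The variable names are assumptions of this sketch, not notation from the notes.

```python
import math

measurements = [3, 1, 5, 6, 4, 4, 3, 5]
n = len(measurements)

# Range: largest measurement minus smallest measurement.
data_range = max(measurements) - min(measurements)

# Sample mean.
sample_mean = sum(measurements) / n

# Sample variance, definitional form: sum of squared deviations / (n - 1).
sum_sq_dev = sum((x - sample_mean) ** 2 for x in measurements)
variance_definition = sum_sq_dev / (n - 1)

# Sample variance, computing form:
# (sum of squares - (square of the sum) / n) / (n - 1).
sum_x = sum(measurements)
sum_x2 = sum(x * x for x in measurements)
variance_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = math.sqrt(variance_shortcut)

print(f"range = {data_range}")
print(f"mean  = {sample_mean}")
print(f"s^2   = {variance_shortcut:.4f} (definition gives {variance_definition:.4f})")
print(f"s     = {sample_sd:.4f}")
print(f"R / s = {data_range / sample_sd:.2f}")
```

Both forms give the same variance; the computing form only needs the running totals of the measurements and of their squares, which is why it is convenient for hand calculation.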

Approximately, $R \approx 4s$, or $s \approx R/4$.

Using Measures of Center and Spread: Tchebysheff's Theorem

Given a number k greater than or equal to 1 and a set of n measurements, at least $1 - \frac{1}{k^2}$ of the measurements will lie within k standard deviations of the mean. The theorem can be used for either samples ($\bar{X}$ and $s$) or for a population ($\mu$ and $\sigma$).

Important Result:
If k = 2, at least $1 - 1/2^2 = 3/4$ of the measurements are within 2 standard deviations of the mean.
If k = 3, at least $1 - 1/3^2 = 8/9$ of the measurements are within 3 standard deviations of the mean.

Using Measures of Center and Spread: The Empirical Rule

Given a distribution of measurements that is approximately mound-shaped:
The interval $\mu \pm \sigma$ contains approximately 68% of the measurements.
The interval $\mu \pm 2\sigma$ contains approximately 95% of the measurements.
The interval $\mu \pm 3\sigma$ contains approximately 99.7% of the measurements.

Examples:

The ages of 50 tenured faculty at a state university are
34, 48, 70, 63, 52, 52, 35, 50, 37, 43, 53, 43, 52, 44, 42, 31, 36, 48, 43, 26, 58, 62, 49, 34, 48, 53, 39, 45, 34, 59, 34, 66, 40, 59, 36, 41, 35, 36, 62, 34, 38, 28, 43, 50, 30, 43, 32, 44, 58, 53
1. Do the data agree with the proportions given by Tchebysheff's Theorem?
2. Do they agree with the Empirical Rule? Why?

The length of time for a worker to complete a specified operation averages 12.8 minutes with a standard deviation of 1.7 minutes. If the distribution of times is approximately mound-shaped, what proportion of workers will take longer than 16.2 minutes to complete the task?
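One way to explore the two examples above is sketched below. It counts the fraction of faculty ages within k sample standard deviations of the sample mean and prints it next to the Tchebysheff lower bound and the Empirical Rule percentage, then locates 16.2 minutes relative to the mean for the worker-time question. The helper `proportion_within` and the other names are hypothetical, introduced only for this illustration.

```python
import statistics

ages = [34, 48, 70, 63, 52, 52, 35, 50, 37, 43,
        53, 43, 52, 44, 42, 31, 36, 48, 43, 26,
        58, 62, 49, 34, 48, 53, 39, 45, 34, 59,
        34, 66, 40, 59, 36, 41, 35, 36, 62, 34,
        38, 28, 43, 50, 30, 43, 32, 44, 58, 53]

mean_age = statistics.mean(ages)
sd_age = statistics.stdev(ages)  # sample standard deviation (divisor n - 1)


def proportion_within(data, center, spread, k):
    """Return the fraction of data within k * spread of center."""
    inside = [x for x in data if abs(x - center) <= k * spread]
    return len(inside) / len(data)


# Compare observed proportions with the theorem's bound and the rule's values.
for k, empirical_rule in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    observed = proportion_within(ages, mean_age, sd_age, k)
    tchebysheff_bound = 1 - 1 / k ** 2
    print(f"k={k}: observed={observed:.2f}, "
          f"Tchebysheff at least {tchebysheff_bound:.3f}, "
          f"Empirical Rule about {empirical_rule}")

# Worker-time question: how many standard deviations above the mean is 16.2?
z = (16.2 - 12.8) / 1.7
# If z is 2, the Empirical Rule puts about 95% within the interval, leaving
# about 5% outside, split evenly between the two tails by symmetry.
upper_tail = (1 - 0.95) / 2
print(f"z = {z:.1f}, approximate proportion above 16.2 minutes = {upper_tail:.3f}")
```

Tchebysheff's Theorem gives only lower bounds, so the observed proportions can never fall below them; whether they are close to the Empirical Rule values depends on how mound-shaped the ages are.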

Measures of Relative Standing

Where does one particular measurement stand in relation to the other measurements in the set of data?

z-score
How many standard deviations away from the mean does the measurement lie? This is measured by the z-score. The sample z-score is defined by

$z\text{-score} = \frac{x - \bar{x}}{s}$

z-scores between -2 and 2 are not unusual.
A z-score should not be more than 3 in absolute value.
z-scores larger than 3 in absolute value indicate a possible outlier.

Percentiles
How many measurements lie below the measurement of interest? This is measured by the p-th percentile. The p-th percentile is the value that is greater than p% of the measurements in the ordered data.

Quartiles
Lower quartile (first quartile): the 25th percentile ($Q_1$) is the value of x which is larger than 25% and less than 75% of the ordered measurements. The position of the first quartile is $0.25(n + 1)$.

Upper quartile (third quartile): the 75th percentile ($Q_3$) is the value of x which is larger than 75% and less than 25% of the ordered measurements. The position of the third quartile is $0.75(n + 1)$.

The second quartile is the median, which in turn is the 50th percentile. Note that if the position of a quartile is not an integer, some modification is needed.

Interquartile range: the range of the middle 50% of the measurements,
$IQR = Q_3 - Q_1$

Example: The prices of 18 brands of walking shoes are
50, 60, 65, 65, 65, 68, 68, 70, 70, 70, 70, 70, 70, 74, 75, 75, 90, 95
Find the IQR and the median.
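The sketch below applies the position formulas to the walking-shoe prices. The notes do not say which modification to use when a position is not an integer; this sketch assumes linear interpolation between the two adjacent ordered values, which is one common convention, and other texts or software may give slightly different quartiles. The helper `value_at_position` is a hypothetical name.

```python
import math


def value_at_position(sorted_data, position):
    """Value at a 1-based, possibly fractional, position in sorted data.

    Assumption: fractional positions are resolved by linear interpolation
    between the neighbouring ordered values.
    """
    n = len(sorted_data)
    if position <= 1:
        return sorted_data[0]
    if position >= n:
        return sorted_data[-1]
    lower = math.floor(position)   # 1-based index of the value just below
    fraction = position - lower    # how far toward the next value
    low_value = sorted_data[lower - 1]
    high_value = sorted_data[lower]
    return low_value + fraction * (high_value - low_value)


shoe_prices = sorted([50, 60, 65, 65, 65, 68, 68, 70, 70,
                      70, 70, 70, 70, 74, 75, 75, 90, 95])
n = len(shoe_prices)

q1 = value_at_position(shoe_prices, 0.25 * (n + 1))      # position 4.75
median = value_at_position(shoe_prices, 0.50 * (n + 1))  # position 9.5
q3 = value_at_position(shoe_prices, 0.75 * (n + 1))      # position 14.25
iqr = q3 - q1

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
```

Under the interpolation assumption the third quartile falls a quarter of the way from the 14th to the 15th ordered price.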

Box plot

A box plot describes the center of the data, how spread out the data are, the extent and nature of any departure from symmetry, and the identification of outliers. In general, a box plot is based on the five-number summary: smallest value, first quartile, median, third quartile, largest value.

Constructing a box plot
Calculate the five-number summary and also the IQR.
Show the five numbers on a horizontal line, draw a box above the horizontal line from $Q_1$ to $Q_3$, and mark the median by a vertical line through the box.
Draw two vertical lines at the lower fence and the upper fence:
Lower fence = $Q_1 - 1.5(IQR)$
Upper fence = $Q_3 + 1.5(IQR)$
Mark the outliers (any observations beyond the fences).
Draw two horizontal lines (whiskers) from the ends of the box to the largest and smallest observations which are not outliers.

Interpreting a box plot
Median line in the center of the box and whiskers of equal length: symmetric distribution.
Median line left of center and a long right whisker: skewed right.
Median line right of center and a long left whisker: skewed left.

Example: Construct a box plot for these data and identify any outliers:
25, 22, 26, 23, 27, 26, 28, 18, 25, 24, 12

Suggested Exercises: 2.5, 2.11, 2.13, 2.15, 2.17, 2.19, 2.21, 2.27, 2.31, 2.35, 2.43, 2.49, 2.53, 2.57, 2.63, 2.65, 2.81
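As a hedged illustration of the box plot construction steps, the sketch below computes the five-number summary, the fences, the outliers, and the whisker ends for the eleven-observation example. It only produces the numbers needed to draw the plot, not the drawing itself. The helper `value_at_position` is a hypothetical name, and its linear interpolation for non-integer positions is an assumption (it is not needed for this data set, where all three positions are whole numbers).

```python
import math


def value_at_position(sorted_data, position):
    """Value at a 1-based position, interpolating linearly if fractional."""
    n = len(sorted_data)
    if position <= 1:
        return sorted_data[0]
    if position >= n:
        return sorted_data[-1]
    lower = math.floor(position)
    fraction = position - lower
    return sorted_data[lower - 1] + fraction * (sorted_data[lower] - sorted_data[lower - 1])


data = sorted([25, 22, 26, 23, 27, 26, 28, 18, 25, 24, 12])
n = len(data)

# Five-number summary, using the positions 0.25(n+1), 0.5(n+1), 0.75(n+1).
smallest, largest = data[0], data[-1]
q1 = value_at_position(data, 0.25 * (n + 1))
median = value_at_position(data, 0.50 * (n + 1))
q3 = value_at_position(data, 0.75 * (n + 1))
iqr = q3 - q1

# Fences: observations beyond them are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers reach the most extreme observations that are not outliers.
non_outliers = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = non_outliers[0], non_outliers[-1]

print(f"five-number summary: {smallest}, {q1}, {median}, {q3}, {largest}")
print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"outliers = {outliers}, whiskers from {whisker_low} to {whisker_high}")
```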