Descriptive Statistics (Chapter 5)

Descriptive Statistics, Chapter 5, pp. 110-130

Graphing data
- Frequency distributions
- Bar graphs
  - Qualitative variable (categories)
  - Bars don't touch
- Histograms and frequency polygons
  - Quantitative variable (ordinal, interval, or ratio scale)
- Others: pie chart, stem-and-leaf plot, scatterplot

Example grade distribution
Class-interval frequency distribution

Grade   Frequency
A       1
A-      2
B+      3
B       7
B-      8
C+      6
C       3
C-      2
D       1
F       1

N = 34
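A frequency distribution like this one is just a tally of raw scores. The sketch below is only an illustration: `raw_grades` is a hypothetical stand-in for a gradebook, rebuilt here from the counts on the slide so that the tally can be reproduced.

```python
from collections import Counter

# Frequencies copied from the slide's grade distribution.
grade_counts = {"A": 1, "A-": 2, "B+": 3, "B": 7, "B-": 8,
                "C+": 6, "C": 3, "C-": 2, "D": 1, "F": 1}

# Hypothetical raw data: one letter grade per student, rebuilt from the counts.
raw_grades = [grade for grade, count in grade_counts.items() for _ in range(count)]

# Tally the raw grades into a frequency distribution.
freq = Counter(raw_grades)
n_total = sum(freq.values())

for grade in grade_counts:  # keep the slide's A-to-F ordering
    print(f"{grade:<4}{freq[grade]:>3}")
print("N =", n_total)
```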

Graphs!
Read the X and Y axes carefully.
[Figure: Death Rates in America, 1980-1989. X axis: year; Y axis: number per 100,000 population; series: age 1-4 and age 15-24]

Example bar graph
Rauscher, Shaw, & Ky (1993): the Mozart effect
[Figure: bar graph with bar values 119, 111, and 110]
N = 36 college students

Lots of cool graphs!
Florence Nightingale's coxcomb diagram
- Blue: died of sickness
- Red: died of wounds
- Black: died of other causes

Graph interpretation
- Read the values on each axis carefully: graphs can be deceiving!
- [Figure labels: reminiscence bump, recency effect]

Descriptive statistics
- Data collected in a study = raw data
- Reports of a study = summary data
- Descriptive statistics provide that summary
- Measures of central tendency: describe the "middleness" of a distribution of scores
  - Mean
  - Median
  - Mode
- Measures of variation: describe the width or dispersion of a distribution
  - Range
  - Standard deviation
  - Variance

Descriptive statistics: measures of central tendency
- Mean
  - Mean for a population = sum of scores / number of scores in the distribution: $\mu = \frac{\sum X}{N}$
  - Mean for a sample = sum of scores / number of scores in the distribution: $M$ (or $\bar{X}$) $= \frac{\sum X}{N}$

Mean as the balance point
The mean balances the distances (deviations) of all scores around it.

Score (X)   Mean   Distance from mean (X - M)
2           5      -3
2           5      -3
6           5       1
10          5       5

ΣX = 20, N = 4, M = 5; Σ(X - M) = 0
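The balance-point property can be checked directly with the four scores from the slide. This is a minimal sketch; the variable names are illustrative only.

```python
scores = [2, 2, 6, 10]  # scores from the slide

# Mean = sum of scores / number of scores.
m = sum(scores) / len(scores)

# Deviation of each score from the mean.
deviations = [x - m for x in scores]

print(m)                # 5.0
print(deviations)       # [-3.0, -3.0, 1.0, 5.0]
print(sum(deviations))  # 0.0: negative and positive distances cancel
```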

Effect of changing 1 score

Scores: 26 29 31 32 34 35 38 40 42 83
ΣX = 390; M = ΣX / N = 390 / 10 = 39

Scores: 26 29 31 32 34 35 38 40 42 33
ΣX = 340; M = ΣX / N = 340 / 10 = 34

The mean is not a robust statistic: it is highly influenced by a single outlier score.
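The same comparison in code, using the two score lists from the slide. The helper `mean` is defined here only for illustration.

```python
def mean(scores):
    """Arithmetic mean: sum of scores divided by the number of scores."""
    return sum(scores) / len(scores)

with_outlier = [26, 29, 31, 32, 34, 35, 38, 40, 42, 83]
without_outlier = [26, 29, 31, 32, 34, 35, 38, 40, 42, 33]

# Changing a single score (83 -> 33) moves the mean by 5 points.
print(mean(with_outlier))     # 39.0
print(mean(without_outlier))  # 34.0
```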

Adding a constant

X:       26 29 31 32 34 35 38 40 42 83    ΣX = 390; M = 390 / 10 = 39
X + 10:  36 39 41 42 44 45 48 50 52 93    ΣX = 490; M = 490 / 10 = 49

If you add a constant to every score, subtract it from every score, or multiply or divide every score by it, the same change is made to M.
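A short check of this rule with the slide's scores. The multiplication case is not worked on the slide; it is included here as an extra illustration of the same statement.

```python
scores = [26, 29, 31, 32, 34, 35, 38, 40, 42, 83]
m = sum(scores) / len(scores)

# Add 10 to every score: the mean goes up by 10.
shifted = [x + 10 for x in scores]
mean_shifted = sum(shifted) / len(shifted)

# Multiply every score by 2: the mean is multiplied by 2.
doubled = [x * 2 for x in scores]
mean_doubled = sum(doubled) / len(doubled)

print(m, mean_shifted, mean_doubled)  # 39.0 49.0 78.0
```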

Descriptive statistics: measures of central tendency
- Mean
  - Mean for a population = sum of scores / number of scores in the distribution: $\mu = \frac{\sum X}{N}$
  - Mean for a sample = sum of scores / number of scores in the distribution: $M$ (or $\bar{X}$) $= \frac{\sum X}{N}$
- Median
  - Middle score in the distribution
  - Order the scores from highest to lowest
  - If N is an even number, average the two middle scores

Calculating the median for RTs

Scores:      512 587 590 578 567 533 573 529 577 572 572 591 575 577 534
Sorted:      512 529 533 534 567 572 572 573 575 577 577 578 587 590 591
Add high X:  512 587 590 578 567 533 573 529 899 572 572 591 575 577 534
Add low X:   512 587 590 578 567 533 573 529 177 572 572 591 575 577 534

Data set          Median   Mean
Original scores   573      564.47
Add high X        573      585.93
Add low X         572      537.80

The median is a robust statistic!
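The three rows of the slide can be reproduced with Python's standard `statistics` module. This is a sketch: the names `high` and `low` are illustrative, and the changed score is the ninth one (577), as in the slide's lists.

```python
from statistics import mean, median

rts = [512, 587, 590, 578, 567, 533, 573, 529, 577, 572, 572, 591, 575, 577, 534]

# Replace the ninth score (577) with an extreme high or low value.
high = rts.copy()
high[8] = 899
low = rts.copy()
low[8] = 177

for label, data in [("original", rts), ("add high X", high), ("add low X", low)]:
    # The median barely moves; the mean shifts by more than 20 points.
    print(f"{label:<11} median={median(data)}  mean={mean(data):.2f}")
```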

Descriptive statistics: measures of central tendency
- Mean
  - Mean for a population = sum of scores / number of scores in the distribution: $\mu = \frac{\sum X}{N}$
  - Mean for a sample = sum of scores / number of scores in the distribution: $M$ (or $\bar{X}$) $= \frac{\sum X}{N}$
- Median
  - Middle score in the distribution
  - Order the scores from highest to lowest
  - If N is an even number, average the two middle scores
- Mode
  - Score that occurs with the greatest frequency

Example grade distribution

Grade   Frequency
A       1
A-      2
B+      3
B       7
B-      8
C+      6
C       3
C-      2
D       1
F       1

N = 34
M = 80.38
Median = 81
Mode = B-

Can have 2+ modes
[Figure: sample grade distribution with 2 modes. X axis: grades A through F; Y axis: frequency, 0 to 7]
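Finding the mode, or several modes, is a matter of counting. In the sketch below the first part uses the grade counts from the earlier example slide; the second part uses a made-up list, `bimodal_grades`, because the slide's two-mode data survive only as a figure. `statistics.multimode` requires Python 3.8 or later.

```python
from statistics import multimode

# Counts from the example grade distribution: the mode is the most frequent grade.
grade_counts = {"A": 1, "A-": 2, "B+": 3, "B": 7, "B-": 8,
                "C+": 6, "C": 3, "C-": 2, "D": 1, "F": 1}
most_common_grade = max(grade_counts, key=grade_counts.get)
print(most_common_grade)  # B-

# Hypothetical data (not from the slide) in which two grades tie for most frequent.
bimodal_grades = ["B", "B", "B", "C+", "C+", "C+", "A", "D"]
modes = multimode(bimodal_grades)
print(modes)  # ['B', 'C+']
```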

Types of distributions
- Normal distribution
  - Bell-shaped
  - Symmetrical
  - Only 1 mode
  - Mean, median, and mode all equal
- Kurtosis: spread of the distribution (how flat or peaked)
  - Mesokurtic: medium peak (like the normal distribution)
  - Leptokurtic: tall and thin
  - Platykurtic: flat and broad

Measures of central tendency
- Indicators of the shape of the distribution
- How the mean, median, and mode change with the shape of the distribution
  - Normal distribution
  - Positive skew: tail toward positive scores
  - Negative skew: tail toward negative scores
[Figure panels: positive skew, negative skew]
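The figures for this slide are lost, so here is a small numeric illustration instead. The two data sets are invented for this sketch, and the pattern they show (the mean pulled toward the tail, away from the median and mode) is the typical case rather than a strict rule.

```python
from statistics import mean, median, mode

# Hypothetical scores with a long tail toward high values (positive skew).
positive_skew = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]

# Mirror image of the same scores: long tail toward low values (negative skew).
negative_skew = [21 - x for x in positive_skew]

for label, data in [("positive skew", positive_skew), ("negative skew", negative_skew)]:
    print(label, "mean =", mean(data), "median =", median(data), "mode =", mode(data))

# Positive skew: mean (4.7) is above the median (3) and mode (3).
# Negative skew: mean (16.3) is below the median (18) and mode (18).
```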

Which measure of central tendency to use?
- If the data are interval or ratio and normally distributed: use the mean
- If the data are interval or ratio and there are outliers or a skewed distribution: use the median
- If the data are nominal: use the mode
- But that's not enough info

Measures of variation
- Range
  - Difference between the lowest and highest scores in a distribution
  - = maximum score - minimum score
  - Easily distorted by an outlier (low or high score)
- Standard deviation
  - Average distance of scores in a distribution from the mean
  - But the deviations from the mean always sum to zero! So:
    - Average deviation: use absolute values
    - Standard deviation: use squared deviation scores
  - For a population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
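To make the three measures concrete, the sketch below computes them for the scores 2, 2, 6, 10 from the balance-point slide. Treating those four scores as a whole population (dividing by N) is an assumption made for the example.

```python
from math import sqrt

scores = [2, 2, 6, 10]
n = len(scores)
m = sum(scores) / n  # mean = 5.0

# Range: maximum score minus minimum score.
score_range = max(scores) - min(scores)

# Average deviation: mean of the absolute distances from the mean.
average_deviation = sum(abs(x - m) for x in scores) / n

# Population standard deviation: square root of the mean squared deviation.
population_sd = sqrt(sum((x - m) ** 2 for x in scores) / n)

print(score_range)              # 8
print(average_deviation)        # 3.0
print(round(population_sd, 4))  # 3.3166 (square root of 11)
```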

Example grade distribution

Grade   Frequency
A       1
A-      2
B+      3
B       7
B-      8
C+      6
C       3
C-      2
D       1
F       1

N = 34
M = 80.38
Median = 81
Mode = B-
s = 7.92

M - s = 72.5
M = 80.38
M + s = 88.3

Note: most scores are within 8 points of the mean.

Calculating standard deviation (σ)
1. Calculate each deviation score (score - mean)
2. Square the deviations
3. Sum the squared deviations
4. Divide by N
   - N = number of scores
   - This step = variance
5. Take the square root of that value

RT     x - M     (x - M)²
512    -52.47    2753.101
587     22.53     507.6009
590     25.53     651.7809
578     13.53     183.0609
567      2.53       6.4009
533    -31.47     990.3609
573      8.53      72.7609
529    -35.47    1258.121
577     12.53     157.0009
572      7.53      56.7009
572      7.53      56.7009
591     26.53     703.8409
575     10.53     110.8809
577     12.53     157.0009
534    -30.47     928.4209

Avg (M) = 564.4667
Sum of (X - M)² = 8593.734
Variance (sum divided by N) = 572.9156
SD (square root of sum / N) = 23.93565
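The five steps can be followed line by line in code. This sketch uses the slide's reaction times and divides by N, treating the 15 scores as the whole population; small differences in the last decimal places come from the slide rounding the mean before squaring.

```python
from math import sqrt

rts = [512, 587, 590, 578, 567, 533, 573, 529, 577, 572, 572, 591, 575, 577, 534]
n = len(rts)
m = sum(rts) / n

deviations = [x - m for x in rts]        # step 1: score - mean
squared = [d ** 2 for d in deviations]   # step 2: square the deviations
sum_of_squares = sum(squared)            # step 3: sum the squared deviations
variance = sum_of_squares / n            # step 4: divide by N (variance)
sd = sqrt(variance)                      # step 5: take the square root

print(f"M = {m:.4f}")
print(f"sum of squares = {sum_of_squares:.3f}")
print(f"variance = {variance:.4f}")
print(f"SD = {sd:.4f}")
```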

Calculating standard deviation (s)
1. Calculate each deviation score (score - mean)
2. Square the deviations
3. Sum the squared deviations
4. Divide by N or N - 1
   - This step = variance
   - Use N for a population
   - Use N - 1 to estimate the population from a sample
5. Take the square root of that value

RT     x - M     (x - M)²
512    -52.47    2753.101
587     22.53     507.6009
590     25.53     651.7809
578     13.53     183.0609
567      2.53       6.4009
533    -31.47     990.3609
573      8.53      72.7609
529    -35.47    1258.121
577     12.53     157.0009
572      7.53      56.7009
572      7.53      56.7009
591     26.53     703.8409
575     10.53     110.8809
577     12.53     157.0009
534    -30.47     928.4209

Avg (M) = 564.4667
Sum of (X - M)² = 8593.734
Variance (sum divided by N - 1) = 613.8381
SD (square root of sum / (N - 1)) = 24.77576
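Python's standard `statistics` module provides both versions, which makes the N versus N - 1 choice easy to compare on the slide's reaction times. A brief sketch: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by N - 1.

```python
from statistics import pstdev, pvariance, stdev, variance

rts = [512, 587, 590, 578, 567, 533, 573, 529, 577, 572, 572, 591, 575, 577, 534]

# Population formulas: divide the sum of squared deviations by N.
pop_var = pvariance(rts)
pop_sd = pstdev(rts)

# Sample formulas: divide by N - 1 to estimate the population from a sample.
sample_var = variance(rts)
sample_sd = stdev(rts)

print(f"N:     variance = {pop_var:.4f}, SD = {pop_sd:.4f}")
print(f"N - 1: variance = {sample_var:.4f}, SD = {sample_sd:.4f}")
```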

Measures of variation
- Standard deviation
  - Population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
  - Sample (when estimating the population): $s = \sqrt{\frac{\sum (X - M)^2}{N - 1}}$
- Variance
  - Population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
  - Sample: $s^2 = \frac{\sum (X - M)^2}{N - 1}$

Why use N - 1?
- A sample tends to be less variable than the population it was drawn from
- Dividing by a smaller number gives a more conservative estimate of the variance or SD
  - It makes the variance larger
- Use N - 1 when you want to draw conclusions about the population (not just describe your sample)
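A small simulation can make this argument tangible. Everything in the sketch below is an assumption chosen for illustration: a normally distributed population with mean 100 and standard deviation 15, samples of size 5, 20,000 repetitions, and a fixed random seed. It averages the two variance estimates over many samples and compares each average with the known population variance.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN = 100.0       # assumed population mean
POP_SD = 15.0          # assumed population standard deviation
POP_VAR = POP_SD ** 2  # true population variance (225.0)
SAMPLE_SIZE = 5
N_SAMPLES = 20_000

total_div_n = 0.0
total_div_n_minus_1 = 0.0
for _ in range(N_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    m = sum(sample) / SAMPLE_SIZE
    sum_of_squares = sum((x - m) ** 2 for x in sample)
    total_div_n += sum_of_squares / SAMPLE_SIZE
    total_div_n_minus_1 += sum_of_squares / (SAMPLE_SIZE - 1)

avg_div_n = total_div_n / N_SAMPLES
avg_div_n_minus_1 = total_div_n_minus_1 / N_SAMPLES

print(f"true variance:            {POP_VAR:.1f}")
print(f"average when dividing by N:     {avg_div_n:.1f}")
print(f"average when dividing by N - 1: {avg_div_n_minus_1:.1f}")
```

Under these assumptions, dividing by N underestimates the population variance on average, while dividing by N - 1 lands much closer to it.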

Thank you, Excel!
For example, if the data are in column B, rows 1 to 20:
- Sum: =SUM(B1:B20)
- Mean: =AVERAGE(B1:B20)
- Median: =MEDIAN(B1:B20)
- Mode: =MODE(B1:B20)
- Maximum score: =MAX(B1:B20)
- Minimum score: =MIN(B1:B20)
- Range: subtract the minimum score from the maximum score, e.g. =MAX(B1:B20)-MIN(B1:B20)
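For readers working outside Excel, the same summaries are available in Python's standard library. The sketch below is one possible equivalent, not part of the original slides; the list `scores` reuses the four scores from the balance-point slide in place of a spreadsheet column.

```python
from statistics import mean, median, mode

scores = [2, 2, 6, 10]  # stand-in for the values in cells B1:B20

total = sum(scores)             # like =SUM(...)
average = mean(scores)          # like =AVERAGE(...)
middle = median(scores)         # like =MEDIAN(...); averages 2 and 6 here
most_frequent = mode(scores)    # like =MODE(...)
highest = max(scores)           # like =MAX(...)
lowest = min(scores)            # like =MIN(...)
score_range = highest - lowest  # maximum minus minimum

print(total, average, middle, most_frequent, highest, lowest, score_range)
```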