Statistical Methods: Introduction, Applications, Histograms, Ch

Similar documents
Sets and Set notation. Algebra 2 Unit 8 Notes

Statistical Concepts. Constructing a Trend Plot

1. AN INTRODUCTION TO DESCRIPTIVE STATISTICS. No great deed, private or public, has ever been undertaken in a bliss of certainty.

Histograms, Central Tendency, and Variability

Background to Statistics

Statistical Concepts. Distributions of Data

1.0 Continuous Distributions. 5.0 Shapes of Distributions. 6.0 The Normal Curve. 7.0 Discrete Distributions. 8.0 Tolerances. 11.

MATH 10 INTRODUCTORY STATISTICS

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives

Example A. Define X = number of heads in ten tosses of a coin. What are the values that X may assume?

1 Normal Distribution.

Example. If 4 tickets are drawn with replacement from ,

The science of learning from data.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

Probability Distributions

Descriptive Statistics

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Chapter 3. Data Description

Essentials of Statistics and Probability

1 Probability Distributions

STAT 200 Chapter 1 Looking at Data - Distributions

CS 5014: Research Methods in Computer Science. Statistics: The Basic Idea. Statistics Questions (1) Statistics Questions (2) Clifford A.

Discrete & Continuous Probability Distributions Sunu Wibirama

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

SUMMARIZING MEASURED DATA. Gaia Maselli

MATH 1150 Chapter 2 Notation and Terminology

Section 5.4. Ken Ueda

are the objects described by a set of data. They may be people, animals or things.

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability

Section 3.4 Normal Distribution MDM4U Jensen

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology Kharagpur

Basic Probability Reference Sheet

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 3-RANDOM VARIABLES

Statistics 100A Homework 5 Solutions

Numerical Methods Lecture 7 - Statistics, Probability and Reliability

Introduction to Statistics

Toss 1. Fig.1. 2 Heads 2 Tails Heads/Tails (H, H) (T, T) (H, T) Fig.2

Business Statistics: A Decision-Making Approach, 6e. Chapter Goals

Chapter 1: Exploring Data

Probability Distributions

AMS 5 NUMERICAL DESCRIPTIVE METHODS

University of Jordan Fall 2009/2010 Department of Mathematics

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

Unit 2. Describing Data: Numerical

Lecture 3. Measures of Relative Standing and. Exploratory Data Analysis (EDA)

Chapter 4a Probability Models

Chapter 1 Statistical Reasoning Why statistics? Section 1.1 Basics of Probability Theory

Statistics and Data Analysis in Geology

Binomial random variable

Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010

All the men living in Turkey can be a population. The average height of these men can be a population parameter

Do students sleep the recommended 8 hours a night on average?

Algebra 2. Outliers. Measures of Central Tendency (Mean, Median, Mode) Standard Deviation Normal Distribution (Bell Curves)

To find the median, find the 40 th quartile and the 70 th quartile (which are easily found at y=1 and y=2, respectively). Then we interpolate:

Chapter 2: Tools for Exploring Univariate Data

Techniques for Improving Process and Product Quality in the Wood Products Industry: An Overview of Statistical Process Control

Homework Example Chapter 1 Similar to Problem #14

Introduction to Basic Statistics Version 2

Chapter 5: Exploring Data: Distributions Lesson Plan

Week 1: Intro to R and EDA

STT 315 This lecture is based on Chapter 2 of the textbook.

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Elementary Statistics

Statistics I Chapter 2: Univariate data analysis

Experiment 1: The Same or Not The Same?

Mean/Average Median Mode Range

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Resistant Measure - A statistic that is not affected very much by extreme observations.

Histograms allow a visual interpretation

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Statistics I Chapter 2: Univariate data analysis

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

Descriptive Data Summarization

Confidence Intervals

Sampling Populations limited in the scope enumerate

Statistics, continued

Random Variable. Discrete Random Variable. Continuous Random Variable. Discrete Random Variable. Discrete Probability Distribution


Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

APS Eighth Grade Math District Benchmark Assessment NM Math Standards Alignment

The Normal Distribution. Chapter 6

Glossary for the Triola Statistics Series

Chapter 4. Displaying and Summarizing. Quantitative Data

Ch. 17. DETERMINATION OF SAMPLE SIZE

1. Exploratory Data Analysis

ACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions

Measures of Central Tendency

Unit 4 Probability. Dr Mahmoud Alhussami

Final Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above

P8130: Biostatistical Methods I

TOPIC: Descriptive Statistics Single Variable

MAT Mathematics in Today's World

Introduction. Probability and distributions

Higher Secondary - First year STATISTICS Practical Book

Chapter 0: Some basic preliminaries

Business Statistics. Lecture 3: Random Variables and the Normal Distribution

Men. Women. Men. Men. Women. Women

A random variable is a variable whose value is determined by the outcome of some chance experiment. In general, each outcome of an experiment can be

Transcription:

Outlines Statistical Methods: Introduction, Applications, Histograms, Characteristics November 4, 2004

Outlines Part I: Statistical Methods: Introduction and Applications Part II: Statistical Methods: Histograms, Characteristics Statistical Methods Introduction Application: How Often Will This Elevator Fail?

Outlines Part I: Statistical Methods: Introduction and Applications Part II: Statistical Methods: Histograms, Characteristics Statistical Methods Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry

Introduction Application: How Often Will This Elevator Fail? Part I Statistical Methods: Introduction and Applications

Introduction Application: How Often Will This Elevator Fail? Introduction Engineering and scientific work often requires the collection of measurements or results from experiments. Most of the time, you can t test every product. Sometimes because of the expense or time required to perform the test, sometimes because the test is destructive. But if the data collected is accurate and representative of all the tested and untested products, you can use your that data to draw conclusions about the population as a whole.

Introduction Application: How Often Will This Elevator Fail? Application/Example: How Often Will This Elevator Fail? Simplified version: an elevator is supported by a wire rope that can withstand a stress of 30,000 psi. The load on the elevator causes a stress in the rope of 25,000 psi. Given this information alone, does the rope break? No. As long as the stress from the elevator load is less than the rope strength, we re ok. No lawsuits, no death or dismemberment, no need for safety devices, no problem. Unfortunately, if you leave your analysis at this level, you re very likely to have a big component failure at some point. Let s try a less simplified example.

Introduction Application: How Often Will This Elevator Fail? Application/Example: How Often Will This Elevator Fail? Statistical version: an elevator is supported by a wire rope that can withstand a stress of mean stress of 30,000 psi with a standard deviation of 1000 psi. The rope supplier isn t perfect, and can t provide ropes that have dead-on textbook values for the material s yield and ultimate strengths 100% of the time. The load on the elevator is random, and causes a stress on the rope with a mean value of 25,000 psi and a standard deviation of 2000 psi. Without getting ahead of ourselves, let s tentatively define the mean as the value that a measurement centers around, and that the measurements will be within 2σ of the mean around 95% of the time, where σ is the standard deviation. So, given all this extra detail, are there situations where the rope can break?

Introduction Application: How Often Will This Elevator Fail? Application/Example: How Often Will This Elevator Fail? We can definitely find situations where the elevator can fail. Around 2.5% of the time, the stress on the rope will be higher than 29,000 psi. Around 2.5% of the time, the rope will be 28,000 psi or weaker. If you happen to subject an unusually weak rope to an unusually heavy load, it will break. Qualitatively, if you plotted the probability distribution of the rope strength and the stress caused by the elevator load, wherever they overlapped would be where failures could occur.

Introduction Application: How Often Will This Elevator Fail? Application/Example: How Often Will This Elevator Fail? Example 6.1 Distribution of Load Distribution of Strength Chance of Failure 2 2.5 3 3.5 x 10 4

Introduction Application: How Often Will This Elevator Fail? Statistics. A branch of mathematics dealing with the scientific method of collecting, analyzing, and displaying data. It also helps in developing models for describing and predicting physical phenomena. Population. The collection of all possible objects with a common measureable or observable feature or characteristic (strength, diameter, defects, etc.). Sample. A group of randomly-selected members of the population. All members of the sample are measured or tested in some repeatable way. Sample size. The number of members in the sample. Generally, larger sample sizes provide more accurate descriptions, and allow for more accurate predictions.

Introduction Application: How Often Will This Elevator Fail? Experiment. The act of performing some thing where the exact outcome is not known. Examples: measuring the diameter of a shaft, tossing a coin, or examining a part for particular types of defects. Event. The outcome of an experiment. Some experiments have finite numbers of discrete outcomes (0 defects found, 1 defect found, 2 or more defects found). Others have an infinite number of real-valued outcomes (shaft diameter of 1.994 in, 1.989 in, 2.011 in, ). Random variable. The measured or observed characteristic of a sample member. If the characteristic isn t really random, it s not a good candidate for statistical methods.

Introduction Application: How Often Will This Elevator Fail? Probability. The chance of a particular event or range of events occurring. The probability of tossing a coin and it coming up heads is 50%. The probability of finding a shaft with a diameter less than 1.990 inches might be 3%, for example.

Histograms and Probability Distribution Functions Part II Statistical Methods: Histograms, Characteristics

Histograms and Probability Distribution Functions Histograms Consider a random variable X that take on m discrete values. If the experiment is repeated n times, you ll find that the variable takes the values x 1, x 2, x 3,, x m a total of n 1, n 2, n 3,, n m times, respectively. A bar chart with values of x i on the horizontal axis and n i on the vertical axis is a histogram or probability mass function.

Histograms and Probability Distribution Functions Probability Density Function Consider a random variable X that be classified into m discrete classes. Each class has a lower limit of x i x 2 and an upper limit of x i + x 2. If the experiment is repeated n times, you ll find that the variable falls into class 1, 2, 3,, m a total of n 1, n 2, n 3,, n m times, respectively. A bar chart with values of x i on the horizontal axis and n i n on the vertical axis is a probability density function.

Histograms and Probability Distribution Functions Probability Density Function 14 Histogram (7 bins) Occurrences 12 10 8 6 4 2 0 27 28 29 30 31 32 33 0.35 0.3 % Occurrences 0.25 0.2 0.15 0.1 0.05 0 27 28 29 30 31 32 33 Yield Strength (kpsi)

Histograms and Probability Distribution Functions Cumulative Distribution Function The cumulative distribution is found by adding up the bars on the probability density function from x 1 to x i. 1 Cumulative Distribution Function 0.9 0.8 0.7 Cumulative Probability 0.6 0.5 0.4 0.3 0.2 0.1 0 27 28 29 30 31 32 33 Yield Strength (kpsi)

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Central Tendency Mean. Also known as the average, the mean of a set of n data values x 1, x 2,, x n is calculated as X = 1 n n i=1 x i If the data are already separated into m classes, as for a histogram, compute the mean as X = 1 n m x im n i i=1 where x im is the midpoint of class i and n i is the number of values of x in class i.

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Central Tendency Median. The median is the value of X where values above and below it occur with equal probability. If n sorted data points are available, then X med can be determined as the middle value of X if n is odd, and the average of the two middle values of X if n is even.

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Central Tendency Mode. The value of a random variable that is most likely to occur. If the data are separated into m classes, the mode of X can be determined as X mod = x i with largest value of n i (i = 1, 2,, m) Modes are not as useful as means or medians in many cases. If the data X is a continuous measurement rather than a discrete one, the choice of class boundaries can change the location of the mode.

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Dispersion The sample variance of a random variable X is denoted by s 2, and is a function of the deviations of the x values from the mean X : { s 2 = 1 n n } (x i X ) 2 = 1 xi 2 n X 2 n 1 n 1 i=1 The sample standard deviation of the random variable is the positive square root of s 2 : s = 1 n (x i X ) n 1 2 i=1 i=1

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Dispersion The population variance and population standard deviation of a random variable X are used when the data represents the entire population rather than a smaller sample. They are denoted with a Greek lowercase sigma (σ) instead of lowercase s : σ 2 = 1 n (x i X ) 2 n σ = 1 n i=1 n (x i X ) 2 i=1

Histograms and Probability Distribution Functions Measures of Central Tendency Measures of Dispersion Measures of Symmetry Measures of Symmetry A skewed distribution is not symmetric, and many of the tests and procedures shown next time will not work properly on a skewed distribution. The basic skewness is calculated as Skewness = 1 n 1 n (x i X ) 3 i=1 and a skewness coefficient γ is defined as γ = Skewness s 3

Histograms and Probability Distribution Functions Work Problem 6.3. Experiment with histograms of this data in MATLAB. Assuming that you have your data in a vector variable named beamloads, the following code will make a histogram with bins centered on bincenters: bincenters =130: 180; [n,x]= hist ( beamloads, bincenters ); bar (x,n,1); Can you find a value for bincenters that has a bell-shaped curve, similar to one of the graphs in Figure 6.5?