

FiniteStatistics.doc 9/26/2008 9:37 AM

Unless we have made a very large number of measurements, we do not know the true mean or standard deviation of a data set exactly. If we assume the values are normally distributed, we can estimate the mean and standard deviation from the data. The sample mean and sample variance are given by:

$$ \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (4.4a) $$

Excel: AVERAGE(values)

$$ S_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (4.4b) $$

The sample standard deviation is the square root of the sample variance, $S_x = \sqrt{S_x^2}$.

Excel: STDEV(values)

How close are these values to the true mean and standard deviation? That depends on how many samples we have.
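As a rough illustration of Eqs. (4.4a) and (4.4b), the sketch below computes the sample mean and sample standard deviation directly from the sums and cross-checks the result against Python's standard library, which plays the same role as the Excel AVERAGE and STDEV functions. The `readings` list and the function names are hypothetical, chosen only for illustration.

```python
import math
import statistics


def sample_mean(values):
    """Eq. (4.4a): sum of the values divided by N."""
    return sum(values) / len(values)


def sample_std(values):
    """Square root of Eq. (4.4b), which divides by N - 1."""
    n = len(values)
    mean = sample_mean(values)
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


# Hypothetical readings, not taken from the notes.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(f"mean = {sample_mean(readings):.3f}")
print(f"S_x  = {sample_std(readings):.3f}")

# statistics.stdev also uses the N - 1 divisor, so the two should agree.
assert math.isclose(sample_std(readings), statistics.stdev(readings))
```

The N - 1 divisor matters most for small samples; `statistics.pstdev` (divisor N) would give a slightly smaller value for the same data.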

4.4 Finite Statistics

[Figure: Normal Distribution Function, p(x) versus x, plotted for Sigma = 0.5, 1.0, 2.0, and 3.0]

For a normally distributed data set, we can say that the probability of a sample, $x_i$, differing from the data set mean value, $\bar{x}$, is given by

$$ x_i = \bar{x} \pm t_{\nu,P} S_x \quad (P\%) \qquad (4.5) $$

$t_{\nu,P}$ is referred to as the t estimator, where $\nu = N - 1$ is the number of degrees of freedom.

Example: http://www.eng.buffalo.edu/courses/mae334/notes/finitestatseample.ls

Find the sample mean, standard deviation, 95% precision interval within which one should expect any measured value to fall, standard deviation of the means, and the 95% estimate of the true mean value.

TABLE 4.1 Sample of Variable x

 i    x_i        i    x_i
 1    0.98      11    1.02
 2    1.07      12    1.26
 3    0.86      13    1.08
 4    1.16      14    1.02
 5    0.96      15    0.94
 6    0.68      16    1.11
 7    1.34      17    0.99
 8    1.04      18    0.78
 9    1.21      19    1.06
10    0.86      20    0.96

Mean: 1.02
Standard Deviation: 0.16
t 19,95%: 2.093
95% Precision Interval: 0.330
Standard Deviation of the means: 0.035
95% Precision interval about the true mean value: 0.074
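The quantities requested in this example can be reproduced with a short script. The sketch below is one possible way to do it, assuming the twenty sample values listed in the example table and taking the t estimator for 19 degrees of freedom at 95% (2.093) as a constant from the text rather than computing it. Variable names are illustrative.

```python
import math
import statistics

# The twenty sample values of the example, i = 1..20.
x = [0.98, 1.07, 0.86, 1.16, 0.96, 0.68, 1.34, 1.04, 1.21, 0.86,
     1.02, 1.26, 1.08, 1.02, 0.94, 1.11, 0.99, 0.78, 1.06, 0.96]

# t estimator for nu = N - 1 = 19 at 95%, copied from the text (not computed).
T_19_95 = 2.093

n = len(x)
mean = statistics.mean(x)                 # sample mean
s_x = statistics.stdev(x)                 # sample standard deviation (N - 1)
precision_interval = T_19_95 * s_x        # +/- range for any single reading
s_mean = s_x / math.sqrt(n)               # standard deviation of the means
mean_interval = T_19_95 * s_mean          # +/- range about the true mean

print(f"Mean:                            {mean:.2f}")
print(f"Standard deviation:              {s_x:.2f}")
print(f"95% precision interval:          {precision_interval:.3f}")
print(f"Standard deviation of the means: {s_mean:.3f}")
print(f"95% interval about true mean:    {mean_interval:.3f}")
```

The printed values should match the results quoted in the example (1.02, 0.16, 0.330, 0.035, 0.074) to the precision shown.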

Table 4.4 Student-t Distribution

 ν     t50      t90      t95      t99
 1    1.000    6.314   12.706   63.657
 2    0.816    2.920    4.303    9.925
 3    0.765    2.353    3.182    5.841
 4    0.741    2.132    2.776    4.604
 5    0.727    2.015    2.571    4.032
 6    0.718    1.943    2.447    3.707
 7    0.711    1.895    2.365    3.499
 8    0.706    1.860    2.306    3.355
 9    0.703    1.833    2.262    3.250
10    0.700    1.812    2.228    3.169
11    0.697    1.796    2.201    3.106
12    0.695    1.782    2.179    3.055
13    0.694    1.771    2.160    3.012
14    0.692    1.761    2.145    2.977
15    0.691    1.753    2.131    2.947
16    0.690    1.746    2.120    2.921
17    0.689    1.740    2.110    2.898
18    0.688    1.734    2.101    2.878
19    0.688    1.729    2.093    2.861
20    0.687    1.725    2.086    2.845
21    0.686    1.721    2.080    2.831
30    0.683    1.697    2.042    2.750
40    0.681    1.684    2.021    2.704
50    0.679    1.676    2.009    2.678
60    0.679    1.671    2.000    2.660
 ∞    0.674    1.645    1.960    2.576

Standard Deviation of the Means

If we take a set of N measurements of the same variable, then repeat this process M times, the mean of each data set will differ somewhat from the others. It can be shown that the mean values themselves will approximately follow a normal distribution, even if the original distribution is not normal. The standard deviation of the means is given by:

$$ S_{\bar{x}} = \frac{S_x}{\sqrt{N}} \qquad (4.6) $$

Notice that the standard deviation of the mean decreases as the sample size increases. We can now say with a certainty of P% that the mean of a sample of N values differs from the true mean of the distribution, $x'$, by an amount

$$ x' = \bar{x} \pm t_{\nu,P} S_{\bar{x}} \quad (P\%) \qquad (4.7) $$

PROBLEM 4.6
Consider a process in which the applied measured load has a known true mean of 100 N and a variance of 400 N^2. An engineer takes 16 measurements at random. What is the probability that this sample will have a mean value between 90 and 110 N?

KNOWN: $x' = 100$ N, $\sigma^2 = 400$ N^2 (so $\sigma = 20$ N)
FIND: For N = 16, $P(90 \le \bar{x} \le 110)$

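SOLUTION
Begin by finding the z value corresponding to each limit on $\bar{x}$:

$$ z = \frac{\bar{x} - x'}{\sigma / \sqrt{N}} $$

For $\bar{x} = 90$ N, $z = \dfrac{90 - 100}{20/\sqrt{16}} = -2.0$

For $\bar{x} = 110$ N, $z = \dfrac{110 - 100}{20/\sqrt{16}} = 2.0$

So, $P(90 \le \bar{x} \le 110) = P(-2.0 \le z \le 2.0) = 2(0.4772) = 0.9544$, where 0.4772 is the one-sided value for z = 2.0 in Table 4.3. So, there is about a 95% chance.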
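The same probability can be checked numerically instead of by table lookup. The following is a minimal sketch, assuming Python 3.8 or later, that models the distribution of the sample mean as a normal distribution with standard deviation sigma / sqrt(N) and uses `statistics.NormalDist` from the standard library. Variable names are illustrative.

```python
import math
from statistics import NormalDist

true_mean = 100.0   # N, known true mean of the load
sigma = 20.0        # N, square root of the 400 N^2 variance
n = 16              # number of measurements in the sample

# Standard deviation of the sample mean: sigma / sqrt(N) = 5 N.
sigma_mean = sigma / math.sqrt(n)

# Distribution of the sample mean about the true mean.
sampling_dist = NormalDist(mu=true_mean, sigma=sigma_mean)

# Probability that the sample mean falls between 90 N and 110 N.
probability = sampling_dist.cdf(110.0) - sampling_dist.cdf(90.0)
print(f"P(90 <= mean <= 110) = {probability:.4f}")
```

The computed value is about 0.9545, which agrees with the table-based 0.9544 to within the rounding of the tabulated entries.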

Table 4.3 Probability Values for Normal Error Function

One-Sided Integral Solution for

$$ p(z_1) = \frac{1}{(2\pi)^{1/2}} \int_0^{z_1} e^{-\beta^2/2} \, d\beta $$

 z1     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
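Entries of this table can also be generated rather than looked up. The sketch below relies on the standard identity that the one-sided integral from 0 to z1 of the standard normal density equals 0.5 * erf(z1 / sqrt(2)); the function name `p` simply mirrors the table's notation and is not part of the original notes.

```python
import math


def p(z1):
    """One-sided integral of the standard normal density from 0 to z1."""
    return 0.5 * math.erf(z1 / math.sqrt(2.0))


# Print a few entries for comparison with the tabulated values.
for z1 in (0.5, 1.0, 1.96, 2.0, 3.0):
    print(f"z1 = {z1:4.2f}   p(z1) = {p(z1):.4f}")

# Two-sided probability that z lies within +/- 2.0 of the mean.
print(f"P(-2.0 <= z <= 2.0) = {2.0 * p(2.0):.4f}")
```

Because the table holds values rounded to four decimals, doubling a table entry can differ from the directly computed two-sided probability in the last digit.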

4.7 Data Outlier Detection

How do you handle spurious data points? The most common and simplest approach is to label points that lie outside the range of 99.8% probability of occurrence, $\bar{x} \pm t_{\nu,99.8\%} S_x$, as outliers. This three-sigma test works well with data sets of 10 or more points.

Example 4.
Given 10 data points with a mean of 27 psi and a standard deviation of 3.8 psi, test for data outliers.

The three-sigma test for small data sets gives a range of $27 \pm t_{9,99.8\%} S_x = 27 \pm 4.3 \times 3.8$, or 11.25 < x < 42.75. There are no data points outside this range.

For large data sets, the modified three-sigma test for outliers can be used. A modified z variable is computed with the data set mean and standard deviation:

$$ z_0 = \frac{|x_i - \bar{x}|}{S_x} $$

$z_0$ is calculated for each data point and the corresponding probability value for the Normal Error Function is found. If the probability of a value lying beyond $z_0$, that is $0.5 - p(z_0)$ from Table 4.3, is less than 0.1%, then the data point is considered an outlier. There is one data point (i = 8) that is outside this range.

4.8 Number of Measurements Required

Some sample statistics must be known to estimate the variation in the data set and therefore estimate a confidence interval in the data yet to be acquired.

$$ x' = \bar{x} \pm t_{\nu,P} S_{\bar{x}} \quad (P\%) $$

The 95% confidence interval is therefore

$$ CI = \pm t_{\nu,95\%} S_{\bar{x}} = \pm t_{\nu,95\%} \frac{S_x}{\sqrt{N}} \quad (95\%) $$

The one-sided precision value d is half the width of this interval:

$$ d = \frac{CI}{2} $$

Therefore it follows that

$$ N \approx \left( \frac{t_{\nu,95\%} S_x}{d} \right)^2 \quad (95\%) $$

Problem 4.4
Estimate the number of measurements of a time-dependent acceleration signal obtained from a vibrating vehicle that would lead to an acceptable confidence interval about the mean of 0.1 g, if the standard deviation of the signal is expected to be 2 g.

KNOWN: CI = 0.1 g, $S_x$ = 2 g
FIND: N

SOLUTION
Let d = CI/2 = 0.05 g. We are looking for the number of measurements required to keep $t S_x / \sqrt{N} \le 0.05$ g at 95%:

$$ N = \left( \frac{t S_x}{d} \right)^2 $$

If we select a large number of measurements, such that $t_{\nu,95}$ = 1.96, then N ≈ 6150. For this value of N, the t value remains unchanged. Thus, a large number of measurements is required due to the close restriction on CI.
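The estimate in this problem is a one-line calculation, sketched below. It assumes the large-sample value t = 1.96, as the solution does; for a small N the t value depends on N itself, so the estimate would have to be iterated against the Student-t table. The function name and its parameters are hypothetical.

```python
import math


def required_measurements(s_x, ci_width, t_value=1.96):
    """Estimate N from N = (t * S_x / d)^2, with d = CI / 2.

    t_value defaults to the large-sample 95% value; this is an assumption
    that only holds when the resulting N is large.
    """
    d = ci_width / 2.0
    return math.ceil((t_value * s_x / d) ** 2)


# Problem data: CI = 0.1 g, expected standard deviation S_x = 2 g.
n_required = required_measurements(s_x=2.0, ci_width=0.1)
print(n_required)
```

The formula gives about 6146.6, which the function rounds up to 6147 whole measurements; the solution above quotes this as roughly 6150.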
