Statistical Analysis of Chemical Data Chapter 4

Random errors arise from limitations on our ability to make physical measurements and from natural fluctuations.

[Figure: histogram (bar graph) and normal curve (line graph).]

Data that vary because of random errors only will be normally distributed around a mean value. The distribution of random data around the mean is characterized by a Gaussian distribution.

Characteristics:
- Bell-shaped
- Center: mean = median = mode
- Standard deviation: width of the distribution

SAMPLE vs. POPULATION
A population is the set of entities about which statistical inferences are to be drawn. A sample is a subset of the population of manageable size. Statistics calculated from the sample are used to infer or extrapolate about the population.

SAMPLE vs. POPULATION
Population mean ($\mu$): mean of the entire population.
Sample mean ($\bar{x}$): mean of a given sample.

When N is big (usually 20-30), $\bar{x} \to \mu$. When N is small, there is a bigger deviation between $\bar{x}$ and $\mu$.

SAMPLE vs. POPULATION
Population standard deviation ($\sigma$): measures the width of the distribution of a population.
Sample standard deviation ($s$): applicable to finite samples.

When N is big (usually 20-30), $s \to \sigma$. When N is small, there is a bigger deviation between $s$ and $\sigma$.

Standard Deviation and Probability

Confidence Interval
A confidence interval (CI) is a range of values within which there is a specified probability of finding the true mean.

If you take only a single measurement from a population, that measurement has a confidence interval of:

$\mu = x \pm z\sigma$

If you take many measurements, the mean of all $n$ measurements has a confidence interval of:

$\mu = \bar{x} \pm \dfrac{z\sigma}{\sqrt{n}}$

**NOTE: This is for cases where the population standard deviation ($\sigma$) is known or there is a good estimate of it.
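As a rough illustration of the two formulas above, the sketch below obtains z from the confidence level with Python's standard-library NormalDist and compares the interval for a single measurement with the interval for the mean of n measurements. The function name z_interval and the numbers (x = 10.0, σ = 0.2, n = 16) are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(center, sigma, n=1, level=0.95):
    """Return (low, high) for center ± z*sigma/sqrt(n), assuming sigma is known."""
    # Two-tailed z: leave (1 - level)/2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / sqrt(n)
    return center - half_width, center + half_width


# Hypothetical values, chosen only for illustration.
single = z_interval(10.0, 0.2)         # one measurement
mean_16 = z_interval(10.0, 0.2, n=16)  # mean of 16 measurements

print("single measurement: %.3f to %.3f" % single)
print("mean of 16:         %.3f to %.3f" % mean_16)
```

With the same σ, the interval for the mean is narrower than the single-measurement interval by a factor of √n (4 in this sketch).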

Confidence Intervals
There is an incorrect notion of the confidence interval: given the true value and a specified confidence level, the measurements will fall within this interval with a certain probability.

The correct notion is: given the sample mean and a specified confidence level, the true mean will fall within this confidence interval with a certain probability.

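Confidence Intervals
Student's t is a statistical tool used to express confidence intervals. We use t when we don't know the population standard deviation; the confidence interval can then be estimated as:

$\mu = \bar{x} \pm \dfrac{ts}{\sqrt{n}}$

Here $s$ is the sample standard deviation and $t$ is read from the t table for $n - 1$ degrees of freedom at the chosen confidence level.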
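A minimal sketch of the t-based interval, applied to the four sulfur results quoted in Example 1 of this chapter. The critical value 3.182 (two-tailed, 95%, 3 degrees of freedom) is an assumption read from a standard t table; the slides do not reproduce the table.

```python
from math import sqrt
from statistics import mean, stdev

# Sulfur results (% S) quoted in Example 1 of this chapter.
results = [0.112, 0.118, 0.115, 0.119]

# Assumption: two-tailed Student's t for 95% confidence and
# n - 1 = 3 degrees of freedom, read from a standard t table.
T_95_DF3 = 3.182

n = len(results)
x_bar = mean(results)
s = stdev(results)  # sample standard deviation (n - 1 in the denominator)
half_width = T_95_DF3 * s / sqrt(n)

print(f"mean = {x_bar:.4f}, s = {s:.4f}")
print(f"95% CI: {x_bar:.4f} ± {half_width:.4f}")
```

Because s is estimated from only four values, t (3.182) is considerably larger than the z of about 1.96 that would apply if σ were known.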

Hypothesis testing employs Student's t statistics
Student's t can be used to compare two sets of measurements to decide whether they are the same or different.
- CASE 1: Comparing a measured value to a theoretical value
- CASE 2: Comparing replicate sets of measurements (with different means and standard deviations)
- CASE 3: Comparing paired data

Hypothesis testing employs Student's t statistics
CASE 1: Comparing measured value to theoretical value
We measure a quantity several times, obtaining an average value and a standard deviation. We need to compare our answer with a known, accepted answer. The average does not agree exactly with the accepted answer. Does our measured answer agree or disagree with the known value within experimental error?

Null Hypothesis ($H_0$): $\bar{x} = \mu_0$

Use the t-statistic:

$t_{calc} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative Hypothesis ($H_a$):
- $\bar{x} \neq \mu_0$ (two-tailed) if $t_{calc} \geq t_{table}$ or $t_{calc} \leq -t_{table}$
- $\bar{x} < \mu_0$ (one-tailed) if $t_{calc} \leq -t_{table}$
- $\bar{x} > \mu_0$ (one-tailed) if $t_{calc} \geq t_{table}$

Hypothesis testing employs Student's t statistics
CASE 1: Comparing measured value to theoretical value
EXAMPLE 1. A new procedure for the rapid determination of the percentage of sulfur in kerosene was tested on a sample known from its method of preparation to contain 0.123% sulfur. The results were % S = 0.112, 0.118, 0.115 and 0.119. Do the data indicate that there is a bias in the method at the 95% confidence level?
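One way to work Example 1 is sketched below as a two-tailed Case 1 test. The critical value 3.182 (95%, 3 degrees of freedom) is an assumption taken from a standard t table, and the variable names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

results = [0.112, 0.118, 0.115, 0.119]  # % S found by the new procedure
mu_0 = 0.123                            # % S known from the preparation

# Assumption: two-tailed critical t for 95% confidence and 3 degrees
# of freedom, taken from a standard t table.
T_TABLE = 3.182

n = len(results)
t_calc = (mean(results) - mu_0) / (stdev(results) / sqrt(n))

# Two-tailed decision: reject H0 if t_calc is beyond ±t_table.
biased = abs(t_calc) >= T_TABLE

print(f"t_calc = {t_calc:.2f}, t_table = {T_TABLE}")
print("bias indicated" if biased else "no bias demonstrated")
```

Under these assumptions t_calc is about -4.43, which lies beyond -3.182, so the null hypothesis is rejected and the data indicate a (negative) bias at the 95% confidence level.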

Hypothesis testing employs Student's t statistics
CASE 1: Comparing measured value to theoretical value
EXAMPLE 2. Sewage and industrial pollutants dumped into a body of water can reduce the dissolved oxygen concentration and adversely affect aquatic species. In one study, weekly readings are taken from the same location in a river over a 2-month period (see table). Some scientists think that 5.0 ppm is a dissolved O2 level that is marginal for fish to live. Conduct a statistical test to determine whether the mean dissolved O2 concentration is less than 5.0 ppm at the 95% confidence level.

Week    Dissolved O2, ppm
1       4.9
2       5.1
3       5.6
4       4.3
5       4.7
6       4.9
7       4.5
8       5.1
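Example 2 asks a one-tailed question (is the mean less than 5.0 ppm?), so only the lower tail is tested. The sketch below assumes a one-tailed critical value of 1.895 (95%, 7 degrees of freedom) from a standard t table; the names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

oxygen_ppm = [4.9, 5.1, 5.6, 4.3, 4.7, 4.9, 4.5, 5.1]  # weeks 1 to 8
mu_0 = 5.0                                              # marginal level, ppm

# Assumption: one-tailed critical t for 95% confidence and 7 degrees
# of freedom, taken from a standard t table.
T_TABLE_ONE_TAILED = 1.895

n = len(oxygen_ppm)
t_calc = (mean(oxygen_ppm) - mu_0) / (stdev(oxygen_ppm) / sqrt(n))

# Ha: mean < mu_0 is accepted only if t_calc falls at or below -t_table.
below_limit = t_calc <= -T_TABLE_ONE_TAILED

print(f"mean = {mean(oxygen_ppm):.3f} ppm, t_calc = {t_calc:.2f}")
print("mean is below 5.0 ppm" if below_limit else "cannot conclude mean < 5.0 ppm")
```

Here t_calc is about -0.79, which does not reach -1.895, so under these assumptions the data do not show that the mean dissolved O2 is below 5.0 ppm at the 95% confidence level.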

Hypothesis testing employs Student's t statistics
CASE 2: Comparing replicate measurements
We measure a quantity multiple times by two different methods that give two different answers, each with its own standard deviation. Do the two results agree with each other within experimental error, or do they disagree?

Null Hypothesis ($H_0$): $\bar{x}_1 = \bar{x}_2$

Use the t-statistic:

$t_{calc} = \dfrac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}}$

$s_{pooled} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

Alternative Hypothesis ($H_a$): $\bar{x}_1 - \bar{x}_2 \neq 0$ if $t_{calc} \geq t_{table}$ (with $n_1 + n_2 - 2$ degrees of freedom)

Hypothesis testing employs Student's t statistics
CASE 2: Comparing replicate measurements
EXAMPLE 3. Lord Rayleigh measured the mass of dry air (O2-free) and chemically generated N2 of the same volume. Is dry air the same as chemically generated N2?

From air (g)    From chemical composition (g)
2.31017         2.30143
2.30986         2.29890
2.31010         2.29816
2.31001         2.30182
2.31024         2.29869
2.31010         2.29940
2.31028         2.29849
-               2.29889

Average:            2.31011 (air), 2.29947 (chemical)
Standard deviation: 0.000143 (air), 0.00138 (chemical)
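The Case 2 formulas can be applied to Rayleigh's raw data as sketched below. The assignment of masses to the two columns follows the reported averages and standard deviations, and the critical value 2.160 (two-tailed, 95%, 13 degrees of freedom) is an assumption from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

from_air = [2.31017, 2.30986, 2.31010, 2.31001, 2.31024, 2.31010, 2.31028]
from_chemical = [2.30143, 2.29890, 2.29816, 2.30182,
                 2.29869, 2.29940, 2.29849, 2.29889]

# Assumption: two-tailed critical t for 95% confidence and
# n1 + n2 - 2 = 13 degrees of freedom, from a standard t table.
T_TABLE = 2.160

n1, n2 = len(from_air), len(from_chemical)
s1, s2 = stdev(from_air), stdev(from_chemical)

# Pooled standard deviation, weighting each variance by its degrees of freedom.
s_pooled = sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))

t_calc = abs(mean(from_air) - mean(from_chemical)) / s_pooled * sqrt(n1 * n2 / (n1 + n2))
different = t_calc >= T_TABLE

print(f"s_pooled = {s_pooled:.5f} g, t_calc = {t_calc:.1f}")
print("the two gases differ" if different else "no difference demonstrated")
```

With these data t_calc is roughly 20, far above 2.160, so the two sets of masses differ well beyond experimental error.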

Hypothesis testing employs Student's t statistics
CASE 2: Comparing replicate measurements
EXAMPLE 4. A reliable assay of ATP in a certain type of cell gives a value of 111.0 µmol/100 mL, with a standard deviation of 2.8 in four replicate measurements. You have developed a new assay which gave the following values in replicate analyses: 117, 119, 111, 115, 120 µmol/100 mL.
a) Find the mean and standard deviation of your new analysis.
b) Can you be 95% confident that your method produces a result different from the reliable value?
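Example 4 gives only summary statistics for the reliable assay, so the sketch below wraps the Case 2 formulas in a helper that accepts means, standard deviations and counts. The helper name pooled_t is hypothetical, and the critical value 2.365 (two-tailed, 95%, 7 degrees of freedom) is an assumption from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """t_calc for two means using the pooled standard deviation (Case 2)."""
    s_pooled = sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
    return abs(mean1 - mean2) / s_pooled * sqrt(n1 * n2 / (n1 + n2))


new_assay = [117, 119, 111, 115, 120]
new_mean = mean(new_assay)  # part (a)
new_s = stdev(new_assay)    # part (a)

# Reliable assay: mean 111.0, s = 2.8, four replicates (from the problem statement).
t_calc = pooled_t(111.0, 2.8, 4, new_mean, new_s, len(new_assay))

# Assumption: two-tailed critical t for 95% confidence and
# 4 + 5 - 2 = 7 degrees of freedom, from a standard t table.
T_TABLE = 2.365
different = t_calc >= T_TABLE

print(f"(a) mean = {new_mean:.1f}, s = {new_s:.2f}")
print(f"(b) t_calc = {t_calc:.2f} vs t_table = {T_TABLE}")
```

The new assay gives a mean of 116.4 and s of about 3.58; t_calc is about 2.46, just above 2.365, so under these assumptions the difference is significant at the 95% level, though only narrowly.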

Hypothesis testing employs Student's t statistics
CASE 3: Comparing paired data
Sample 1 is measured once by Method A and once by Method B, which do not give exactly the same result. Then a different sample, designated as sample 2, is also measured once by Method A and once by Method B; again the results are not exactly equal. The procedure is repeated for n different samples. Do the two methods agree with each other within experimental error, or is one systematically different from the other?

Null Hypothesis ($H_0$): $\bar{d} = \Delta_0$ (oftentimes $\Delta_0 = 0$)

Use the t-statistic:

$t_{calc} = \dfrac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}$

Here $\bar{d}$ is the mean of the n paired differences and $s_d$ is their standard deviation.

Alternative Hypothesis ($H_a$): $\bar{d} \neq \Delta_0$ (two-tailed) if $t_{calc} \geq t_{table}$ or $t_{calc} \leq -t_{table}$

Hypothesis testing employs Student's t statistics
CASE 3: Comparing paired data
EXAMPLE 5. A new automated procedure for determining glucose in serum (Method A) is to be compared with an established method (Method B). Both methods are performed on the serum from six patients to eliminate patient-to-patient variability. Do the following results confirm a difference in the two methods at the 95% confidence level?

Patient             1     2     3     4     5     6
Method A, mg/L      1044  720   845   800   957   650
Method B, mg/L      1028  711   820   795   935   639
Difference, mg/L    16    9     25    5     22    11
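A short sketch of the paired test for Example 5, taking the hypothesized difference as zero. The critical value 2.571 (two-tailed, 95%, 5 degrees of freedom) is an assumption from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

method_a = [1044, 720, 845, 800, 957, 650]  # mg/L, patients 1 to 6
method_b = [1028, 711, 820, 795, 935, 639]  # mg/L, same patients

# Work on the per-patient differences, which removes patient-to-patient variation.
differences = [a - b for a, b in zip(method_a, method_b)]

# Assumption: two-tailed critical t for 95% confidence and 5 degrees
# of freedom, from a standard t table.
T_TABLE = 2.571

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)
t_calc = (d_bar - 0) / (s_d / sqrt(n))  # hypothesized difference is 0
different = abs(t_calc) >= T_TABLE

print(f"d_bar = {d_bar:.2f} mg/L, s_d = {s_d:.2f} mg/L, t_calc = {t_calc:.2f}")
```

The mean difference is about 14.7 mg/L and t_calc is about 4.63, which exceeds 2.571, so the two methods differ at the 95% confidence level under these assumptions.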

Hypothesis testing employs Student's t statistics
CASE 3: Comparing paired data
EXAMPLE 6. Two different analytical methods were used to determine residual chlorine in sewage effluents. Both methods were used on the same samples, but the samples came from various locations, with differing amounts of contact time. The concentrations of Cl in mg/L are given in the table. Do the two methods give different results at the 90%, 95% and 99% confidence levels?

Sample    Method A    Method B
1         0.39        0.36
2         0.84        1.35
3         1.76        2.56
4         3.35        3.92
5         4.69        5.35
6         7.70        8.33
7         10.52       10.70
8         10.92       10.91
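Example 6 asks for the same paired comparison at three confidence levels, which can be sketched by checking one t_calc against several critical values. The two-tailed values for 7 degrees of freedom (1.895, 2.365, 3.499) are assumptions from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

method_a = [0.39, 0.84, 1.76, 3.35, 4.69, 7.70, 10.52, 10.92]  # mg/L Cl
method_b = [0.36, 1.35, 2.56, 3.92, 5.35, 8.33, 10.70, 10.91]  # mg/L Cl

# Assumption: two-tailed critical t values for 7 degrees of freedom,
# keyed by confidence level in percent, from a standard t table.
T_TABLE = {90: 1.895, 95: 2.365, 99: 3.499}

differences = [b - a for a, b in zip(method_a, method_b)]
n = len(differences)
t_calc = mean(differences) / (stdev(differences) / sqrt(n))

# One verdict per confidence level: True means the methods differ.
verdicts = {level: abs(t_calc) >= t for level, t in T_TABLE.items()}

print(f"t_calc = {t_calc:.2f}")
for level, differs in verdicts.items():
    print(f"{level}%: {'different' if differs else 'no difference shown'}")
```

With t_calc near 3.65, the difference is significant at all three levels under these assumptions, although the margin at 99% (critical value 3.499) is small.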

Dealing with BAD DATA
Bad data are due to GROSS ERRORS and result in outliers. We use the Q test to decide whether to reject or retain an outlier.

$Q_{calc} = \dfrac{|x_{\text{questionable}} - x_{\text{nearest neighbor}}|}{\text{spread}}$

Here the spread is the range of the data (largest value minus smallest value). If $Q_{calc} > Q_{table}$, discard the questionable data point.

EXAMPLE 7. The analysis of a calcite sample yielded % CaO of 55.95, 56.00, 56.04, 56.08 and 56.23. The last value appears anomalous; should it be retained or discarded at the 95% confidence level?
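The Q test for Example 7 can be sketched as follows. The helper name q_calc is hypothetical, and the critical value 0.710 (five observations, 95% confidence) is an assumption taken from a standard Q table, which the slides do not reproduce.

```python
def q_calc(values, suspect):
    """Gap between the suspect value and its nearest neighbor, divided by the spread."""
    others = list(values)
    others.remove(suspect)  # drop one copy of the suspect value
    gap = min(abs(suspect - x) for x in others)
    spread = max(values) - min(values)
    return gap / spread


cao_percent = [55.95, 56.00, 56.04, 56.08, 56.23]

# Assumption: critical Q for n = 5 at 95% confidence, from a standard Q table.
Q_TABLE = 0.710

q = q_calc(cao_percent, 56.23)
discard = q > Q_TABLE

print(f"Q_calc = {q:.3f}, Q_table = {Q_TABLE}")
print("discard 56.23" if discard else "retain 56.23")
```

Here Q_calc = 0.15/0.28, about 0.54, which is below 0.710, so under these assumptions the value 56.23 should be retained.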