-However, this definition can be expanded to include: biology (biometrics), environmental science (environmetrics), economics (econometrics).

Similar documents
Application of mathematical, statistical, graphical or symbolic methods to maximize chemical information.

Basic Statistics. 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation).

Statistics: Error (Chpt. 5)

Chemometrics. Matti Hotokka Physical chemistry Åbo Akademi University

Experimental design. Matti Hotokka Department of Physical Chemistry Åbo Akademi University

SCHOOL OF MATHEMATICS AND STATISTICS

Experimental design. Matti Hotokka Department of Physical Chemistry Åbo Akademi University

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Basic Statistics. 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation).

Chapter 5. Errors in Chemical Analysis 熊同銘.

How to Describe Accuracy

Topic 2 Measurement and Calculations in Chemistry

Econometrics. 4) Statistical inference

Lectures 5 & 6: Hypothesis Testing

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Review of Statistics 101

Statistical Analysis of Chemical Data Chapter 4

EPAs New MDL Procedure What it Means, Why it Works, and How to Comply

Practical Statistics for the Analytical Scientist Table of Contents

Quality Assurance is what we do to get the right answer for our purpose FITNESS FOR PURPOSE

Chapter 2: Statistical Methods. 4. Total Measurement System and Errors. 2. Characterizing statistical distribution. 3. Interpretation of Results

Descriptive Statistics

Statistical Analysis of Engineering Data The Bare Bones Edition. Precision, Bias, Accuracy, Measures of Precision, Propagation of Error

Control Charts Confidence Interval Student s t test. Quality Control Charts. Confidence Interval. How do we verify accuracy?

1] What is the 95% confidence interval for 6 measurements whose average is 4.22 and with a standard deviation of 0.07?

Why is the field of statistics still an active one?

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

Uncertainty of Measurement (Analytical) Maré Linsky 14 October 2015

Chap The McGraw-Hill Companies, Inc. All rights reserved.

ADVANCED ANALYTICAL LAB TECH (Lecture) CHM

4.1 Hypothesis Testing

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov

Simple and Multiple Linear Regression

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data

Final Exam. Name: Solution:

Chapter 8: Sampling, Standardization, and Calibration

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope

Ch. 1: Data and Distributions

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

Taguchi Method and Robust Design: Tutorial and Guideline

WISE International Masters

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

Schedule. Draft Section of Lab Report Monday 6pm (Jan 27) Summary of Paper 2 Monday 2pm (Feb 3)

Review of Multiple Regression

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

Analytical Performance & Method. Validation

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

ESP 178 Applied Research Methods. 2/23: Quantitative Analysis

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Chapter 23: Inferences About Means

Let the x-axis have the following intervals:

Econometrics Midterm Examination Answers

Business Statistics. Lecture 10: Course Review

Lecture 3. - all digits that are certain plus one which contains some uncertainty are said to be significant figures

Ch Inference for Linear Regression

Source: Chapter 5: Errors in Chemical Analyses

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Simple Linear Regression for the Climate Data

Inferences for Regression

Regression Analysis for Data Containing Outliers and High Leverage Points

MassLynx 4.1 Peak Integration and Quantitation Algorithm Guide

ST430 Exam 2 Solutions

y = a + bx 12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation Review: Interpreting Computer Regression Output

Homework 2: Simple Linear Regression

OF ANALYSIS FOR DETERMINATION OF PESTICIDES RESIDUES IN FOOD (CX/PR 15/47/10) European Union Competence European Union Vote

Difference in two or more average scores in different groups

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

Evaluating Hypotheses

Performance of Deming regression analysis in case of misspecified analytical error ratio in method comparison studies

Formal Statement of Simple Linear Regression Model

Model Specification and Data Problems. Part VIII

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Data analysis and Geostatistics - lecture VII

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing

Experiment 1 (Part A): Plotting the Absorption Spectrum of Iron (II) Complex with 1,10- Phenanthroline

Advanced Experimental Design

Correlation and Regression

Calibration (The Good Curve) Greg Hudson EnviroCompliance Labs, Inc.

STATISTICS SYLLABUS UNIT I

Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).

LAB 2. HYPOTHESIS TESTING IN THE BIOLOGICAL SCIENCES- Part 2

14 Multiple Linear Regression

Density Temp vs Ratio. temp

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

MANOVA is an extension of the univariate ANOVA as it involves more than one Dependent Variable (DV). The following are assumptions for using MANOVA:

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

Final Exam - Solutions

Analysis of variance

1/11/2011. Chapter 4: Variability. Overview

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE

Name SUMMARY/QUESTIONS TO ASK IN CLASS AP STATISTICS CHAPTER 1: NOTES CUES. 1. What is the difference between descriptive and inferential statistics?

How s that *new* LOD Coming? Rick Mealy WWOA Board of Directors DNR LabCert

Confidence Intervals, Testing and ANOVA Summary

Transcription:

Chemometrics Application of mathematical, statistical, graphical or symbolic methods to maximize chemical information. -However, this definition can be expanded to include: biology (biometrics), environmental science (environmetrics), economics (econometrics). -Two lines of development: experimental design: planning and performing experiments in a way that the resulting data contains the maximum information about stated questions. multivariate data analysis: utilizing all available data in the best possible way. 1

Chemometrics 1. Errors in Quantitative Analysis & Descriptive Statistics. Signal Processing & Time-Series Analysis 3. Experimental Design & Optimization 4. Factorial Designs & Analysis 5. Fractional Factorial Designs & Analysis 6. Univariate Calibration & Least Squares 7. Linear vs- Multivariate Regression 8. Principal Component Regression 9. Quality Assurance & Good Laboratory Practice

Errors in Quantitative Analysis Student A Results have two characteristics: 1. All very close to each other (10.08-10.1 ml).. All too high (10.00 ml exactly). Two types of errors have occurred: 1. Random cause replicate results to differ from one another results fall on both sides of the mean (10.10 ml in student s A case). effect the precision or reproducibility of the experiment. Student A small random errors (precise).. Systematic cause all the results to be in error in the same sense (High). Total systematic error is termed bias. Hence: Student A has precise, but biased results. 3

Errors in Quantitative Analysis No analysis is free of error! Types of errors 1. Gross readily described, errors that are obvious. Instrument breakdown, dropping a sample, contamination (gross).. Random 3. Systematic Student A B C D 10.08 9.88 10.19 10.04 10.11 10.14 9.79 9.18 Results (ml) 10.09 10.0 9.69 10.0 10.10 9.80 10.05 9.97 10.1 10.1 9.78 10.04 Comments Precise, biased Imprecise, unbiased Imprecise, biased Precise, unbiased Titration each student performs an analysis in which exactly 10.00 ml of exactly 0.1 M NaOH is titrated with exactly 0.1M HCl. 4

Errors in Quantitative Analysis Accuracy How far a result is from the true value. Precision How close multiple determinations are to each other. 5

1. Basic Statistics Descriptive Spectrophotometric measurement (Abs) of a sample solution from 15 replicate measurements. Measurement Value Measurement Value 1 0.3410 9 0.3430 0.3350 10 0.340 3 0.3470 11 0.3560 4 0.3590 1 0.3500 5 0.3530 13 0.3630 6 0.3460 14 0.3530 7 0.3470 15 0.3480 8 0.3460 6

Descriptive statistics for the spectrophotometric measurements. Parameter Value Sample #, n Mean Median Std Dev RSD % Std error Max value Min value 15 0.3486 0.347 0.00731.096 0.00189 0.363 0.335 Statistical Tests Student s t-test, F-test, tests for outliers Distributions Gaussian, Poisson, binominal 7

1. Statistics of Repeated Measurements Student A Mean x = x n Standard deviation i s = ( x x) /( n 1) x i ( x i x) ( x i x) 10.08-0.0 0.0004 10.11 0.01 0.0001 10.09-0.01 0.0001 10.10 0.00 0.0000 10.1 0.0 0.0010 Total 50.50 0 0.0010 8

xi 50.50 x = = = 10. 1mL n 5 s = ( x x) /( n 1) = 0.001/ 4 = 0. 0158mL 9

. Distribution of Repeated measurements 1. Gaussian distribution Bell-shaped curve for the frequency of the measurements Fig. 1 Histogram for the measurements of spectrophotometric data. Theoretical distribution with the Gaussian curve in the solid line. 10

A. Gaussian Distribution 11

3. Significance Tests in Analytical Measurements Testing the truth of the hypothesis (null hypothesis, H o ) Null = implies that no difference exists between the observed and known values x Assuming H o is true, stats can be used to calculate the probabilty that the difference between and true value, µ, arises solely as a result of random errors. Is the difference significant? x µ t = n one variable t test s where s = estimate of the standard deviation n = number of parallel measurements (student's t ) 1

3. Statistical tests A. Student s t If t exceeds a certain critical value, then the H o is rejected, 13

3. Statistical tests A. Student s t Ex: The following results were obtained in the determination of Fe 3+ in water samples. 504 ppm 50.7 ppm 49.1 ppm 49.0 ppm 51.1 ppm Is there any evidence of systematic error? x = 50.06 s = 0.956 H o no systematic error, i.e., µ = 50 and using equations Table: critical value is t 4 =.78 (p=0.05). NO significant difference since the observed t is less than the critical value (H o retained) 14

3. Statistical tests B. Two-sided t-test A comparison of two sample means: x t = 1 x n1n sd n1 + n x1 x where : n 1, n = # of parallel determinations for s d = weighted averaged standard deviation: x1 and x s d = ( n 1 1) s1 + ( n 1) s n + n 1 Note: The null hypothesis is accepted if x1 and x are different only randomly at risk level P, i.e. if the calculated t-value is lower than the tabulated value for t. 15

3. Statistical test C. F- test Used to compare the standard deviations of two random samples s1 F= (wheres1 > s s Note : The null hypothesis is accepted if differ randomly, i.e., if the calculated F-value is lower than F-distribution from the Table with v 1 + v degrees of freedom. ) s 1 and s 16

3. Statistical test F- test 17

3. Statistical test Ex. Determination of titanium content (absolute %) by two laboratories. Lab 1 Lab 0.470 0.59 0.448 0.490 0.463 0.489 0.449 0.51 0.48 0.486 0.454 0.50 0.477 0.409 18

Compare the standard deviations between the two laboratories: s 0.09 F = 1 = = 1.58 Critical Valuefrom the table 6.85 0.018 = s Calculated value is lower than the tabulated, hence the test result is not significant (only random differences). 19

3. Significance Tests D. Dixon s Q-test Test for the determination of outliers in data sets Q x x x and Q xn xn = x x 1 1 = n n x1 n 1 The H o, i.e., that no outlier exists, is accepted if the quantity Q<Q(1-P;n) 0

Table: Critical values for the Q-test at H o at the 1% risk level n Q(0.99; n) 3 0.99 4 0.89 5 0.76 6 0.70 7 0.64 8 0.59 9 0.56 10 0.53 11 0.50 1 0.48 13 0.47 14 0.45 15 0.44 0 0.39 1

3. Significance Tests Ex: Trace analysis of PAH s in soil reveals benzo [a] pyrene in the following amounts (mg kg -1 ) 5.30 5.00 5.10 5.0 5.10 6.0 5.15 Use the Q test to determine if the largest & smallest are outliers Q 5.10 5.00 = 0.083 6.0 5.00 = Qn 6.0 5.30 = 6.0 5.00 0.75 Critical values from the Table, Q(1-P = 0.99; n = 7) = 0.64 For the smallest, Q 1 <Q calc, thus the value cannot be an outlier For the largest, Q 1 >Q calc, then the outlier exists Why is this determination important?

3. Significance Tests E. Analysis of variance (ANOVA) Separate and estimate the causes of variation, more than one source of random error. Example? Note: There are numerous chemometric tools which can be used with ANOVA. We will cover them in this course! 3

E. ANOVA Within sample variation Conditions A. Freshly prepared B. Stored for 1 hr in the dark C. Stored for 1 hr in the subdued light D. Stored for 1 hr in bright light Replicate Measurements 10, 100, 101 101, 101, 104 97, 95, 99 90, 9, 94 Mean 101 10 97 97 ( x For each sample, the variance can be calculated: i ( n x) 1) 4

E. ANOVA Sample A = (10 101) + (100 101) 3 1 + (101 101) = 1 Sample B = (101 10) + (101 10) 3 1 + (104 10) = 3 C + D = 4 Sample Mean variance (101 98) + (10 98) + (97 98) 3 1 + (9 98) = 6 / 3 Note: This estimate has 3 degrees of freedom since it is calculated from 4 samples 5

E. ANOVA Summarizing our calculations: within sample mean square = 3 with 8 d.f. between sample mean sample = 6 with 3 d.f. F = 6/3 = 0.7 From Table the critical value of F= 4.066 (P=0.05) - since the calculated value of F is greater than this, the null hypothesis is rejected: the sample means DO differ significantly. 6

E. The Arithmetic of ANOVA calculations Source of Variation Sum squares d.f. Between sample Ti T h-1 within sample by subtraction by subtraction xij Total i j N N-1 where N = nh = total # of measurements Ti = sum of the measurements in the i th sample T = sum of all the measurements - The test statistic is F = between sample mean square / within sample mean square and the critical value is F h-t, N-h i i n N T 7

Test whether the samples in previous table were drawn from population, with equal means. (All values have had 100 subtracted from them) A B C D 1-3 -10 0 1-5 -8 n=3, h=4, N=1, xij = 58 i i j L 1 4-1 -6 Ti 3 6-9 -4 Ti 9 36 81 576 T i = 70 i 8

Source of Variation Sum of Squares d.f. Mean Square Between sample 70/3-(-4)/1=186 3 186/3 = 6 Within sample By subtraction =4 8 4/8=3 Total 58-(-4)/1 = 10 11 F= 6/3 = 0.7 Significant difference 9

, Errors and Calibration Determining Concentration from Calibration Curve Basic steps: (1) Make a series of dilutions of known concentration for the analyte. () Analyze the known samples and record the results. (3) Determine if the data is linear. (4) Draw a line through the data and determine the line's slope and intercept. (5) Test the unknown sample in duplicate or triplicate. Use the line equation to determine the concentration of the analyte: y = mx + b Conc analyte = reading intercept slope 30

Calibration Cont Limit of Detection (LOD) - lowest amount of analyte in a sample which can be detected but not necessarily quantitated as an exact value. - mean of the blank sample plus or 3 times the SD obtained on the blank sample (i.e., LOD = mean blk + Zs blk ) LOD calculation - alternative Data required: (1) calibration sensitivity = slope of line through the signals of the concentration standards including blank solution () standard deviation for the analytical signal given by the blank solution LOD = 3x SD blank signals slope of signals for std's 31

Outliers Treatment of Outliers 1. Re-examine for Gross Errors. Estimate Precision to be Expected 3. Repeat Analysis if Time and Sufficient Sample is Available 4. If Analysis can not be Repeated, Perform a Q-Test 5. If Q-Test Indicates Retention of Value, Consider Reporting the Median 3