Comparing your lab results with the others by one-way ANOVA


You may have developed a new test method, and in your method validation process you would like to check the method's ruggedness by conducting a simple cross-check programme: comparing your own test results against the results provided by other collaborating peer laboratories on a similar sample. What you may have done is to send them one or more similar homogeneous samples, request them to follow your method closely, and report the repeatability data after their laboratory analyses. When all the data are collated, you can carry out an analysis of variance (ANOVA) to see whether all the data sets are significantly comparable or not. In fact, ANOVA is an extremely powerful statistical technique which can be used to separate and estimate the different causes of variation. In this case, the causes of variation are the laboratory factor and the random measurement error of each participating laboratory. Since there is only one controlled factor, the statistical approach is called one-way ANOVA. Let's see an example to demonstrate how one-way ANOVA can be carried out. The following table shows the results for 4 similarly prepared samples received by 4 different laboratories in a simple cross-check programme organised for the measurement of zinc content in water (mg/L) by the FAAS method:

Trial No.    Lab 1    Lab 2    Lab 3    Lab 4
1            103      102      97.4     107
2            99       102      95.3     110
3            101      106      99.5     109
Mean, mg/L   101.0    103.3    97.4     108.7

Overall mean: 102.6 mg/L

The above table shows that the mean values for the 4 samples (drawn from the same source) reported by the participating laboratories were apparently different. But were they really so? We know that, because of random error, even if the true value we are trying to measure (say, 100 mg Zn/L) is unchanged, the sample mean may vary from one laboratory to another.
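The summary row of this table can be reproduced in a few lines of Python (a minimal sketch using only the standard library; the lab data are typed in from the table above):

```python
from statistics import mean

# Zinc results (mg/L) from the cross-check table above
labs = {
    "Lab 1": [103, 99, 101],
    "Lab 2": [102, 102, 106],
    "Lab 3": [97.4, 95.3, 99.5],
    "Lab 4": [107, 110, 109],
}

# Per-lab means and the overall (grand) mean
lab_means = {name: round(mean(vals), 1) for name, vals in labs.items()}
overall_mean = round(mean(v for vals in labs.values() for v in vals), 1)

print(lab_means)     # per-lab means, mg/L
print(overall_mean)  # grand mean, mg/L
```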

ANOVA therefore tests whether the difference between the results reported by the 4 laboratories is too great to be explained by random error alone. First of all, let's set up a hypothesis test:

H0: all the laboratories were reporting comparable population mean results with a common variance σ0², i.e. µ1 = µ2 = µ3 = µ4
H1: the mean results were not all comparable, i.e. not all of µ1, µ2, µ3, µ4 are equal

On the basis of this hypothesis, the data variance σ0² can be estimated in two ways:
* one involving the variation within each laboratory, and
* the other involving the variation between the laboratories.

(i) Within-lab variation

Remember that variance is the square of the standard deviation. Hence, upon calculation, we have:

Variance of Lab 1 = 4.00
Variance of Lab 2 = 5.33
Variance of Lab 3 = 4.41
Variance of Lab 4 = 2.33

Averaging these values gives:

Within-sample estimate of σ0² = (4.00 + 5.33 + 4.41 + 2.33)/4 = 4.02

This estimate has 8 degrees of freedom, v [each sample estimate had (3 − 1) or 2 degrees of freedom, and there were 4 laboratories].

(ii) Between-lab variation

Since the samples analysed by the 4 laboratories were all drawn from one population (one source) with population variance σ0², the means of the samples analysed should come from a sampling distribution with variance σ0²/n, where n is the number of replicate measurements per laboratory. (Recall the well-known Central Limit Theorem result µ = x̄ ± z·σ/√n, i.e. the standard error of the mean is σ/√n.)

Thus, if the null hypothesis H0 is true, the variance of the means of the samples analysed by these laboratories gives an estimate of σ0²/n.
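These two estimates can be sketched in Python (standard library only; `statistics.variance` gives the sample variance, with n − 1 in the denominator):

```python
from statistics import mean, variance

labs = [[103, 99, 101], [102, 102, 106], [97.4, 95.3, 99.5], [107, 110, 109]]
n = 3  # replicate measurements per lab

# Within-lab estimate of sigma0^2: average of the four sample variances (8 df)
within_ms = mean(variance(lab) for lab in labs)

# Between-lab estimate: the variance of the lab means estimates sigma0^2 / n,
# so multiply by n to put it on the same scale (3 df)
lab_means = [mean(lab) for lab in labs]
between_ms = n * variance(lab_means)

print(round(within_ms, 2), round(between_ms, 2))
```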

In this example, we have:

Lab 1                          101.0
Lab 2                          103.3
Lab 3                          97.4
Lab 4                          108.7
Mean of lab means, mg/L        102.6
Number of labs (h)             4
Degrees of freedom             3
Std deviation (SD)             4.72
Variance of lab means = (SD)²  22.314

That means: σ0²/3 = 22.314 (where n = 3 measurements for each lab)

Therefore, the between-sample (laboratory) estimate of σ0² (the mean square) = 22.314 × 3 = 66.94

Note: this estimate of σ0² does not depend on the variability within each laboratory, because it is calculated from the lab means. In this case, the number of degrees of freedom v is (4 − 1) or 3, the mean square is 66.94, and so the sum of squared terms is 3 × 66.94 = 200.82.

Now, let's summarise our calculations so far:

Within-lab mean square = 4.02 with df v = 8
Between-lab mean square = 66.94 with df v = 3

If our H0 is correct, these two estimates of σ0² should not differ significantly; otherwise, under H1, the between-lab estimate of σ0² will be greater than the within-lab estimate because of the between-lab variation. To test whether it is significantly greater, a one-tailed F-test is used:

F(3,8) = 66.94 / 4.02 = 16.7

From the F-table, we find that the critical value of F(3,8) is 4.07 at 95% confidence. Since the calculated value of F is greater than this critical value, the null hypothesis H0 is rejected, i.e. the laboratory means do differ significantly. In other words, the proposed analytical method may not be rugged enough and does not seem to be reproducible by all the participating laboratories.
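The F-ratio itself is then a one-liner; a minimal sketch, with the tabulated 95% critical value of F(3,8) hard-coded rather than computed:

```python
between_ms = 66.94   # between-lab mean square (3 df)
within_ms = 4.02     # within-lab mean square (8 df)
f_crit = 4.07        # tabulated critical value of F(3,8) at 95% confidence

F = between_ms / within_ms
verdict = "reject H0" if F > f_crit else "retain H0"
print(round(F, 1), verdict)
```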

If we were to use the Data Analysis tool of the Microsoft Excel spreadsheet (ANOVA: Single Factor) to analyse the above data, we should get the following results, which are exactly the same as those derived above from first principles:

ANOVA: Single Factor

SUMMARY
Groups   Count   Sum       Average   Variance
Lab 1    3       303.000   101.000   4.000
Lab 2    3       310.000   103.333   5.333
Lab 3    3       292.200   97.400    4.410
Lab 4    3       326.000   108.667   2.333

ANOVA
Source of Variation   SS        df   MS       F        P-value   F crit
Between Labs          200.827   3    66.942   16.656   0.00084   4.07
Within Labs           32.153    8    4.019
Total                 232.980   11

Another question: which lab or labs were the root cause of the non-comparable results? Were they all significantly different amongst themselves? To answer this, we can rearrange the means of the 4 labs in increasing order and compare the difference between adjacent values with a quantity called the least significant difference (LSD), which has the following expression:

LSD = t_{h(n−1)} · s · √(2/n)

where s is the within-lab estimate of σ0 (i.e. s² is the within-lab mean square),
h = number of levels of the factor (labs in this case),
n = number of repeats in each lab,
and h(n − 1) is the number of degrees of freedom of this estimate.
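The LSD formula translates directly into code (a sketch using the numbers derived above, with the tabulated t-value hard-coded):

```python
from math import sqrt

s = sqrt(4.02)  # within-lab standard deviation (s^2 = within-lab mean square)
h, n = 4, 3     # number of labs, repeats per lab
t = 2.306       # tabulated two-tailed t for h*(n-1) = 8 df at 95% confidence

lsd = t * s * sqrt(2 / n)
print(round(lsd, 2))  # least significant difference, mg/L
```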

In the above example, the sample means arranged in increasing order of size are:

Lab 3   Lab 1   Lab 2   Lab 4
97.4    101.0   103.3   108.7

The degrees of freedom v = 4(3 − 1) = 8, and hence t(v=8) = 2.306 at 95% confidence. The least significant difference is therefore 2.306 × √4.02 × √(2/3) = 3.78.

Now,

difference of Lab 1 and Lab 3 = 101.0 − 97.4 = 3.6
difference of Lab 2 and Lab 1 = 103.3 − 101.0 = 2.3
difference of Lab 4 and Lab 2 = 108.7 − 103.3 = 5.4

Therefore, when we compare this LSD value with the differences between the means, we note that:
- the mean of Lab 4 differs significantly from those of Labs 1, 2 and 3;
- the mean values of Labs 1, 2 and 3 do not differ significantly from each other.
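The adjacent-pair comparison above can be automated as a final check (a sketch; the LSD value of 3.78 is carried over from the calculation above):

```python
lsd = 3.78  # least significant difference at 95% confidence (from above)
means = {"Lab 3": 97.4, "Lab 1": 101.0, "Lab 2": 103.3, "Lab 4": 108.7}

# Sort the labs by mean, then compare each adjacent pair against the LSD
ordered = sorted(means.items(), key=lambda kv: kv[1])
results = {}
for (low, m_low), (high, m_high) in zip(ordered, ordered[1:]):
    diff = round(m_high - m_low, 1)
    results[f"{high} - {low}"] = (diff, diff > lsd)

print(results)  # {pair: (difference in mg/L, significant?)}
```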