UCLA STAT 110B Applied Statistics for Engineering and the Sciences

Similar documents
2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

This chapter focuses on two experimental designs that are crucial to comparative studies: (1) independent samples and (2) matched pair samples.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Common Large/Small Sample Tests 1/55

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

Frequentist Inference

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

University of California, Los Angeles Department of Statistics. Hypothesis testing

1036: Probability & Statistics

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators

Chapter 13: Tests of Hypothesis Section 13.1 Introduction

1 Inferential Methods for Correlation and Regression Analysis

Topic 9: Sampling Distributions of Estimators

UCLA STAT 110B Applied Statistics for Engineering and the Sciences

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

Data Analysis and Statistical Methods Statistics 651

This is an introductory course in Analysis of Variance and Design of Experiments.

Chapter two: Hypothesis testing

STAT431 Review. X = n. n )

STATISTICAL INFERENCE

Class 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

MATH/STAT 352: Lecture 15

Sample Size Determination (Two or More Samples)

October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1

Sample questions. 8. Let X denote a continuous random variable with probability density function f(x) = 4x 3 /15 for

Last Lecture. Wald Test

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

6 Sample Size Calculations

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

Stat 319 Theory of Statistics (2) Exercises

Lecture 6 Simple alternatives and the Neyman-Pearson lemma

Chapter 6 Sampling Distributions

- E < p. ˆ p q ˆ E = q ˆ = 1 - p ˆ = sample proportion of x failures in a sample size of n. where. x n sample proportion. population proportion

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

Expectation and Variance of a random variable

A statistical method to determine sample size to estimate characteristic value of soil parameters

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

Read through these prior to coming to the test and follow them when you take your test.

Final Examination Solutions 17/6/2010

MA238 Assignment 4 Solutions (part a)

Properties and Hypothesis Testing

Statisticians use the word population to refer the total number of (potential) observations under consideration

Statistics 511 Additional Materials

Kurskod: TAMS11 Provkod: TENB 21 March 2015, 14:00-18:00. English Version (no Swedish Version)

5. Likelihood Ratio Tests

HYPOTHESIS TESTS FOR ONE POPULATION MEAN WORKSHEET MTH 1210, FALL 2018

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Chapter 11: Asking and Answering Questions About the Difference of Two Proportions

MidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

(7 One- and Two-Sample Estimation Problem )

Statistics 20: Final Exam Solutions Summer Session 2007

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

Module 1 Fundamentals in statistics

LESSON 20: HYPOTHESIS TESTING

UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences

CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.

Math 140 Introductory Statistics

Random Variables, Sampling and Estimation

Statistical inference: example 1. Inferential Statistics

Chapter 4 Tests of Hypothesis

UCLA STAT 110B Applied Statistics for Engineering and the Sciences

Class 27. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.

The Hong Kong University of Science & Technology ISOM551 Introductory Statistics for Business Assignment 3 Suggested Solution

Chapter 8: Estimating with Confidence

Successful HE applicants. Information sheet A Number of applicants. Gender Applicants Accepts Applicants Accepts. Age. Domicile

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.

Chapter 13, Part A Analysis of Variance and Experimental Design

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

MBACATÓLICA. Quantitative Methods. Faculdade de Ciências Económicas e Empresariais UNIVERSIDADE CATÓLICA PORTUGUESA 9. SAMPLING DISTRIBUTIONS

Chapter 20. Comparing Two Proportions. BPS - 5th Ed. Chapter 20 1

5. A formulae page and two tables are provided at the end of Part A of the examination PART A

Lesson 2. Projects and Hand-ins. Hypothesis testing Chaptre 3. { } x=172.0 = 3.67

Statistical Intervals for a Single Sample

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Continuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised

Topic 18: Composite Hypotheses

Stat 421-SP2012 Interval Estimation Section

Chapter 1 (Definitions)

KLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3

Transcription:

UCLA TAT 110B Applied tatistics for Egieerig ad the cieces Istructor: Ivo Diov, Asst. Prof. I tatistics ad Neurology Teachig Assistats: Bria Ng, UCLA tatistics Uiversity of Califoria, Los Ageles, prig 003 http://www.stat.ucla.edu/~diov/courses_studets.html REVIEW Estimatio (ch. 6, 7) Hypothesis Testig (ch. 8) Two Importat Aspects of tatistical Iferece Poit Estimatio Estimate a ukow parameter, say θ, by some statistic computed from the give data which is referred to as a poit estimator. : is a poit estimate of σ Iterval Estimatio A parameter is estimated by a iterval that we are reasoably sure cotais the true parameter value. : A 95% cofidece iterval for θ Hypothesis Testig Test the validity of a hypothesis that we have i mid about a particular parameter usig sample data. lide 1 lide Cofidece Itervals for the Mea, µ Normally Distributed Populatio If σ kow costruct with ormal distributio If σ ukow ad < 30 costruct with studet s T distributio Arbitrarily Distributed Populatio - If >> 30 apply Cetral Limit Theorem ad use ormal distributio Cofidece Iterval for µ from a Normally Distributed Populatio, σ kow X µ Z ~ N(0,1) σ Fid the value z α/ such that: X µ P α α z < < z α σ 1 lide 3 lide 4 cot. Costruct a 90% cofidece iterval for the mea of a ormally distributed populatio specified by x 5, σ 4, 5 Costruct a 99% cofidece iterval for the mea of a ormally distributed populatio specified by x 5, σ 4, 5 lide 5 lide 6

cot. Costruct a 99% cofidece iterval for the mea of a ormally distributed populatio specified by x 5, σ 4, 100 Cofidece Iterval for µ from a Normally Distributed Populatio, σ ukow T X µ t α, 1 ~ t 1 Fid the value t α/,-1 such that: P t X µ < < t α, 1 α, 1 1 α lide 7 lide 8 Costruct a 90% cofidece iterval for the mea of a ormally distributed populatio specified by x 5, 4, 10 lide 9 T Large ample (>>30) Cofidece Iterval for µ from a Arbitrarily Distributed Populatio Apply Cetral Limit Theorem ice is large, the T-distributio limits to the stadard ormal. Hece, use a stadard ormal whe computig cofidece itervals regardless of whether σ is kow or ukow. df X X 1 ~ To µ Zo µ ~ N(0,1) σ lide 10 cot. Costruct a 90% cofidece iterval for the mea of a arbitrarily distributed populatio specified by x 5, 4, 35 Parameter (Poit) Estimatio (6.) Two Ways of Proposig Poit Estimators Method of Momets (MOMs): et your k parameters equal to your first k momets. olve. (e.g., Biomial, Expoetial ad Normal) Method of Maximum Likelihood (MLEs): 1. Write out likelihood for sample of size.. Take atural log of the likelihood. 3. Take partial derivatives with respect to your k parameters. 4. Take secod derivatives to check that a maximum exists. 5. et 1 st derivatives equal to zero ad solve for MLEs. e.g., Biomial, Expoetial ad Normal lide 11 lide 1

Parameter (Poit) Estimatio uppose we flip a coi 8 times ad observe {T,H,T,H,H,T,H,H}. Estimate the value p P(H). Method of Momets Estimate p^: et your k parameters equal to your first k momets. Let X {# T s} p8pe(x) ample#h s 5 p^5/8. Method of Maximum Likelihood Estimate p^: 1. f(x 8 5 3 p (1 likelihood fuctio. 5 8. 5 3 8 l p (1 l + 5 l( + 3 l( 1 5 5 3. 8 d l + 5 l( + 3 l( 1 5 d p 5(1 3p 0 p 5 8 5 3 0 p 1 p lide 13 Hypothesis Testig the Likelihood Ratio Priciple Let {X 1,, X }{0.5, 0.3, 0.6, 0.1, 0.}, weights, be IID N(µ, 1) f(x;µ). Joit desity is f(x 1,,x ; µ)f(x 1 ;µ)x xf(x ;µ). The likelihood fuctio L( f(x 1,,X ; L( µ ) λ( x,..., x ) e l( L) (0.5 µ ) + (0.3 µ ) + (0.6 µ ) + (0.1 µ ) + (0. µ ) d l( L) 0 (0.5 µ ) (0.3 µ ) (0.6 µ ) (0.1 µ ) (0. µ ) + 10 1 (0.5 µ) + (0.3 µ) + (0.6 µ) + (0.1 µ ) + (0. µ) dµ µ 3.4 µ 0.34 lide 14 Hypothesis Testig I ay problem there are two hypotheses: Null Hypothesis, H o Alterative Hypothesis, H a We wat to gai iferece about H a, that is we wat to establish this as beig true. Our test results i oe of two outcomes: Reject H o implies that there is good reaso to believe H a true Fail to reject H o implies that the data does ot support that H a is true; does ot imply, however, that H o is true Hypothesis Testig - Motivatio Poit Estimates do t mea a thig uless you kow how reliable the measuremet is. Reportig a iterval estimate at a certai level of cofidece is a simple way to express ucertaity i your estimates. Hypothesis testig is about makig oe of two coclusios, reject or fail to reject, about a specified hypothesis, while kowig somethig about the probabilities of the two types of errors i your coclusio. The type I error you cotrol by choosig α ad your rejectio regio. If the sample size is fixed, the probability of a type II error ca be foud assumig a certai alterative is true. If the sample size has t bee determied, you ca fid a sample size sufficiet to esure the probability of a type II error is below a desired level for a certai alterative. lide 15 lide 16 Hypothesis Testig - Motivatio Hypothesis Testig teps i Geeral: 1. Idetify parameter of iterest. Describe it i cotext.. Determie Null Value ad tate Null Hypothesis. 3. Determie alterative value/regio ad state ull hypothesis. 4. Write Test tatistic without eterig sample quatities. 5. tate α ad rejectio regio. 6. Calculate Test tatistic usig ecessary sample quatities. 7. tate coclusio (reject or fail to reject) ad iterpret i cotext. Hypothesis Testig - Motivatio What Test Do I Use whe? 1. X ~ N(µ ukow, σ kow ) oe-sample Z. X ~ N(µ ukow, σ ukow ) oe-sample T 3. X ~ D(µ ukow, σ ukow ), where D is fairly symmetric ad is moderately big oe-sample T 4. X ~ D(µ ukow, σ ukow ), where D is ot symmetric ad is really big oe-sample T 5. X ~ D(µ ukow, σ ukow ), where D is ot symmetric ad is ot big o-parametric, e.g. sig test. 6. X ~ Bi( kow,p ukow ) Z test for proportios lide 17 lide 18

Hypothesis Testig - Motivatio The above situatios each have their correspodig power calculatios ad cofidece itervals. I cases 1-4, the cofidece itervals ca be used to aswer the hypothesis testig questio. However, i case 6 the cofidece itervals should ot be used to aswer the hypothesis testig questio. Hypothesis Testig tatistical vs. Practical igificace lide 19 lide 0 Is a secod child geder iflueced by the geder of the first child, i families with >1 kid? First ad ecod Births by ex Aalysis of the birth-geder data data summary ecod Child Male Female First Child Male 3,0,776 Female,60,79 Total 5,8 5,568 Research hypothesis eeds to be formulated first before collectig/lookig/iterpretig the data that will be used to address it. Mothers whose 1 st child is a girl are more likely to have a girl, as a secod child, compared to mothers with boys as 1 st childre. Data: 0 yrs of birth records of 1 Hospital i Aucklad, NZ. lide 1 ecod Child Group Number of births Number of girls 1 (Previous child was girl) 541 79 (approx. 51.6%) (Previous child was boy) 5978 776 (approx. 46.4%) Let p 1 true proportio of girls i mothers with girl as first child, p true proportio of girls i mothers with boy as first child. Parameter of iterest is p 1 -p. H 0 : p 1 -p 0 (skeptical reactio). H a : p 1 -p >0 (research hypothesis) lide Hypothesis testig as decisio makig Decisio Makig Decisio made H 0 is true H 0 is false Accept H 0 as true OK Type II error Reject H 0 as false Type I error OK lide 3 Actual situatio ample sizes: 1 541, 5978, ample proportios (estimates) pˆ 79 / 541 0.5159, pˆ 776 / 5978 0.4644, 1 H 0 : p 1 -p 0 (skeptical reactio). H a : p 1 -p >0 (research hypothesis) Aalysis of the birth-geder data amples are large eough to use Normal-approx. ice the two proportios come from totally diff. mothers they are idepedet use formula 8.5.5.a Estimate - HypothesizedValue t 5.49986 0 E pˆ pˆ 0 pˆ pˆ 1 1 ˆ (1 ˆ ) ˆ (1 ˆ ) ˆ ˆ p p p p E p p 1 1 1 + 1 P value Pr( T t ) 1.9 10 8 0 lide 4

Aalysis of the birth-geder data We have strog evidece to reject the H 0, ad hece coclude mothers with first child a girl a more likely to have a girl as a secod child. How much more likely? A 95% CI: CI (p 1 -p ) [0.033; 0.070]. Ad computed by: estimate ± z E pˆ pˆ ± 1.96 E ˆ ˆ p p 1 1 pˆ (1 pˆ ) pˆ (1 pˆ ) pˆ pˆ ± 1.96 1 1 + 1 1 0.0515 ± 1.96 0.0093677 [3% ;7%] lide 5 Hypothesis Testig the Likelihood Ratio Priciple Let {X 1,, X } be a radom sample from a desity f(x;, where p is some populatio parameter. The the joit desity is f(x 1,, x ; f(x 1 ; x xf(x ;. The likelihood fuctio L( f(x 1,, X ; Testig: H o : p is i Ω vs H a : p is i Ω a, where ΩI Ω a 0 Fid max of L( i Ω. max L( p Ω Fid max of L( i Ω a. λ( x 1,..., x) max L( Fid likelihood ratio p Ω a Reject H o if likelihood-ratio statistics λ is small (λ<k) lide 6 Hypothesis Testig the Likelihood Ratio Priciple Let {X 1,, X }{0.5, -0.3, -0.6, 0.1, 0.} be IID N(µ, 1) f(x;µ). The joit desity is f(x 1,,x ; µ)f(x 1 ;µ)x xf(x ;µ). The likelihood fuctio L( f(x 1,,X ; Testig: H o : σ>0.9 is i Ω vs H a : σ<0.9 Reject H o if likelihood-ratio statistics λ is small (λ<k) max L( p Ω l(umer) quadratic i µ! λ ( 1,..., ) 0 λ x x max L( l(deo) quadratic i µ! p Ω a Maximize both fid ratio (0.5 µ) + ( 0.3 µ ) + ( 0.6 µ ) + (0.1 µ ) + (0. µ) max e µ > 0 (0.5 µ) + ( 0.3 µ ) + ( 0.6 µ ) + (0.1 µ ) + (0. µ) max e µ 0 lide 7 Let P(Type I) α t o 1/λ ο ~ t α,df4 oe-sample T-test Test Procedure 1. Calculate a Test tatistic (: z o ). pecify a Rejectio Regio (: z o > z α ) 3. The ull hypothesis is rejected iff the computed value for the statistic falls i the rejectio regio lide 8 Type I ad Type II Errors Type I ad Type II Errors α β Pr{ Reject H0 H0 is true} Pr{ Fail to Reject H0 H0 is False} The value of α is specified by the experimeter The value of β is a fuctio of α,, ad δ (the differece betwee the ull hypothesized mea ad the true mea). For a two sided hypothesis test of a ormally distributed populatio β Φ Z α δ + Φ Z σ δ + σ It is ot true that α 1- β (RHthis is the test power!) lide 9 α Let µ deote the tread life of a certai type of tire. We would like to test whether the advertised tread life of 50,000 miles is accurate based o a sample of 5 tires from a ormally distributed populatio with σ 1500. What are the ull ad alterative hypothesis? From these samples, we computed x 49000 Perform this hypothesis test at a level of sigificace α0.05 (type I error). What is our coclusio? Now, assume that the true mea is actually 49,50 miles. Give samples of size 5, what is the probability of a type II error, i.e. Pr(fail to reject the ull hypothesis give that it is false)? lide 30

Type I ad Type II Errors H o : µ 50,000 miles, sample-size 5, Normal distributio with σ 1500. H 1 : µ < 50,000. α0.05 (type I error). Assume true µ 49,50 miles. What is the P(type II error)? X - ~ N(50,000; 300 ) α0.05 Z(X - - µ)/σ -1.64 X - Z*σ + µ 49,508 β P(fail to reject H o give that H o is false)? There is a differet β for each differet true mea µ. β P(X - avg.miles > µ 49,508 X - ~N(49,50; 1,500 /5) ) Z(49,508-49,50)/300 0.86 β P( Z > 0.86) 0.1939 Power of Test 1- β 0.8061 Aother Type I ad Type II Errors Oe type of car is kow to sustai o visible damage i 5% of 10-mph crash tests.a ew bumper is proposed that icreases this proportio. Let p be the ew proportio of cars with o damage usig the ew bumpers. H o : p0.5, H 1 : p>0.5. X umber of crushes/test with o damage i 0 experimets. Uder H o we expect to get about *p5 o damage tests. uppose we d ivest i ew bumper techology if we get > 8 o damage tests rejectio regio R{8,9, 0}. Fid α ad β. How powerful is this test? lide 31 lide 3 Aother Type I ad Type II Errors H o : p0.5, H 1 : p>0.5. X umber of crushes/test with o damage i 0 experimets. X~Biomial(0, 0.5). Rejectio regio R{8,9, 0}. Fid α P(Type I) P(X>8 whe X~Biomial(0, 0.5)). Use OCR resource α 1-0.898 0.10 Fid β(p0.3) P (Type II) P(ca t reject H o X~Biomial(0, 0.3))P(X<7 X~Biomial(0,0.3)) Use OCR resource β 0.77 Fid β(p0.5) P (Type II) P(ca t reject H o X~Biomial(0, 0.5))P(X<7 X~Biomial(0,0.5)) Use OCR resource β 0.13 Comparig Populatio Meas Geeralize cofidece itervals ad tests for a sigle populatio parameter to that of two populatio parameters Cosider two populatios with meas µ 1 ad µ. We wat to estimate µ 1 - µ, or possibly test H o : µ 1 µ s: µ 1 average crop yield usig fertilizer 1 µ average crop yield usig fertilizer µ 1 wome s average height µ me s average height lide 33 lide 34 Assumptios Expectatio ad Variace of X Y X 1,X,,X m is a radom sample from a populatio with mea µ 1 ad variace σ 1 Y 1,Y,,Y is a radom sample from a populatio with mea µ ad variace σ The X ad Y samples are idepedet of oe aother We will ivestigate usig X Y as a estimator of the differece i the meas µ 1 - µ lide 35 lide 36

Test Procedures ad Cofidece Itervals for Normal Populatios with Kow Variaces (9.1) If both samples have a ormal distributio, the the test statistic X Y has a ormal distributio as well. It may be stadardized by Test Procedures ad Cofidece Itervals for Large amples (9.1) Whe both samples are large (>30 ad m>30): The CLT guaratees that regardless of the distributio of the data, X ad Y will have a Normal distributio. The estimated stadard deviatios will be close to the populatio stadard deviatios lide 37 lide 38 Use the followig data to costruct a 95% cofidece iterval for µ 1 - µ. m 45, x 4500, s 00, 45, y 40400, s 1 1900 Cosider the followig data regardig boredom. The followig table records the sample mea ad stadard deviatio of the Boredom Proeess Ratig for 97 male ad 148 female college studets surveyed. Test the hypothesis that the mea ratig is higher for me tha wome at a.05 level of sigificace. Geder N Avg ampld Male 97 10.4 4.83 Female 148 9.6 4.68 lide 39 lide 40 Two ample t Test ad Cofidece Iterval (9.) The populatio variaces are ukow At least oe of the samples has a small sample size Assume each populatio is ormally distributed. Experimetally, this may be established through ormal probability plots. X ad Y are stadardized ad distributed accordig to a t distributio Cosider the followig stress limits for differet types of woods. Test the hypothesis that the true average stress limit for red oak exceeds that of Douglas fir by 1MPa Type N Avg ampld Red Oak 14 8.48 0.79 Douglas Fir 10 6.65 1.8 lide 41 lide 4

Pooled t Procedures Not covered by the curret editio of the book Assumes that the populatio variaces are equal, i.e. σ 1 σ Outperforms the two sample t-test i β for a give level of α if the hypothesized equality of variaces is true. ame is true for the cofidece itervals May give erroeous results, however, if the variaces are ot equal, i.e. ot robust to violatios of this assumptio Comparig two meas for idepedet samples uppose we have samples/meas/distributios as follows: { x, N( µ, σ )} ad { x, N( µ, σ ) }. We ve 1 1 1 see before that to make iferece about µ we 1 µ ca use a T-test for H 0 : µ µ 0 with ( x 1 x ) 0 1 t 0 Ad CI( µ ) 1 µ E( x 1 x ) x 1 x ± t E( x 1 x ) If the samples are idepedet we use the E formula E s / + s / with df Mi( 1; 1. ) 1 1 1 This gives a coservative approach for had calculatio of a approximatio to the what is kow as the Welch procedure, which has a complicated exact formula. lide 43 lide 44 Meas for idepedet samples equal or uequal variaces? Pooled T-test is used for samples with assumed equal variaces. Uder data Normal assumptios ad equal variaces of ( x 1 x 0) / E( x 1 x ), where ( 1) s + ( 1) s E s 1 1 p 1/ + 1/ ; s 1 p + 1 is exactly tudet s t distributed with df ( + ) 1 Here s p is called the pooled estimate of the variace, sice it pools ifo from the samples to form a combied estimate of the sigle variace σ 1 σ σ. The book recommeds routie use of the Welch uequal variace method. lide 45 Aalysis of Paired Data (9.3) Paired Data - Oe set of idividuals or objects; two observatios made o each idividual. Upaired Data -Two idepedet sets of idividuals or objects; oe observatio per idividual lide 46 Cosider testig whether a ew drug sigificatly lowers blood pressure usig 0 radomly selected patiets Upaired Data Radomly select 10 patiets for the drug (1) ad 10 for the placebo (). Observe the magitude of the reductio i blood pressure after takig medicatio. Test H o : µ 1 µ vs. H a : µ 1 > µ usig two-sample t- test What about the age of the persos selected? Youger people may be more susceptible to a decrease i blood pressure tha are older people. Ca use pairig to block out age effect. lide 47 lide 48

The Paired t-test Assume that the data cosists of idepedetly selected pairs (X 1,Y 1 ), (X,Y ),,(X,Y ) Defie D 1 X 1 -Y 1, D X -Y,, D X -Y. The D i s are the differeces withi pairs. Check that the D i s are ormally distributed usig a ormal probability plot. Let D X Y The µ D Hece testig H o : µ D is equivalet to testig H o : µ 1 - µ ice the D i s are idepedet ad ormally distributed R.V.s, we ca use a oe sample t-test to test the above hypothesis lide 49 lide 50 Let d ad s D be the sample mea ad sample stadard deviatio. It follows that the Cofidece Iterval ad Hypothesis Test for the paired t-test are Paired t- vs. Two-ample t-test The paired t-test has fewer degrees of freedom tha the two-sample t-test. Hece, the twosample t-test has a smaller β error for a fixed level of α tha does the paired t-test. However, if there is a positive correlatio betwee experimetal uits, the paired t-test will reduce the variace accordigly resultig i a more sigificat T statistic, where the two-sample t-test does ot. lide 51 lide 5 Paired t- vs. Two-ample t-test Use paired t-test if: There is great heterogeeity betwee experimetal uits ad a large correlatio withi pairs Use Two-ample t-test if: The experimetal uits are relatively homogeous ad the correlatio betwee pairs is small Ifereces Cocerig the Differece i Populatio Proportios (9.4) Previous sectios (9.1,,3): We compared the differece i the meas (µ 1 - µ ) of two differet populatios This sectio (9.4): We compare the differece i the proportios (p 1 p ) of two differet populatios lide 53 lide 54