Lecture 5: Parametric Hypothesis Testing: Comparing Means
GENOME 560, Spring 2016
Doug Fowler, GS (dfowler@uw.edu)

Review from last week
What is a confidence interval?
An interval estimate of a parameter.
Repeated sampling would produce a set of CIs that would encompass the true population parameter X% of the time.

Relationship between consistency and bias
Consistent: as the sample size $n$ increases, the estimator $\hat{\theta}_n$ gets closer to $\theta$.
Unbiased: the expected value of the estimator equals the parameter, $E[\hat{\theta}_n] = \theta$.
Consistency describes the behavior of an estimator in the limit. A consistent estimator will converge to the correct value. However, individual estimators in the sequence (e.g. for $n < \infty$) may be biased, even if the overall sequence is consistent.
https://www.youtube.com/watch?v=21lxgc02xwm& ebc=anypxkqutamwhxqrd8teblk3_hrdyp3gaovhb kaa8shph2w6rxoumbqq- adm1buldyy2v54b6lcy5qpvbpt80g- 7fVHHgmVQ&ohtml5=False

Goals
Introduction to hypothesis testing
One-sample hypothesis tests
  One-sample t-test
Two-sample hypothesis tests
  Two-sample paired and unpaired t-test
R session
  Doing t-tests

Introduction to hypothesis testing

What is hypothesis testing?
"A statistical test examines a set of sample data and, on the basis of an expected distribution of the data, leads to a decision about whether to accept the hypothesis underlying the expected distribution or reject that hypothesis and accept an alternative one."
Over the next several weeks, we will come at this basic problem from many angles, but the general idea will remain the same.
Statistical tests rely on computing a test statistic (i.e. a statistic we will compare between samples: the mean, etc.).
To generate the expected distribution of the test statistic (often called the null distribution) we can use parametric distributions or, through randomization, generate our own based on the data.
Then we can ask how probable it would be, under the null distribution, to obtain a test statistic value at least as extreme as the one we calculated from the sample. If it is highly unlikely, we might reject the null hypothesis.
[Figure: two density curves, vertical axis from 0.0 to 0.4.]
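The randomization route mentioned above can be sketched in R as a label-shuffling (permutation) null distribution. This is only an illustration under stated assumptions: the two groups and their values are made up, the test statistic is taken to be the difference in sample means, and the names `group_a`, `group_b` and `null_stats` are hypothetical.

```r
# Hypothetical measurements for two groups (made-up numbers for illustration).
group_a <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.2)
group_b <- c(4.2, 4.9, 4.4, 5.0, 4.1, 4.6)

# Test statistic: difference in sample means.
observed <- mean(group_a) - mean(group_b)

# Build a null distribution by shuffling the group labels many times.
set.seed(1)  # fixed seed so the sketch is reproducible
pooled <- c(group_a, group_b)
n_a <- length(group_a)
null_stats <- replicate(10000, {
  shuffled <- sample(pooled)
  mean(shuffled[1:n_a]) - mean(shuffled[-(1:n_a)])
})

# Two-sided p-value: fraction of shuffled statistics at least as extreme
# as the observed one.
p_value <- mean(abs(null_stats) >= abs(observed))
```

The rest of this lecture takes the parametric route instead, where the null distribution comes from a known family (normal or t) rather than from shuffling.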

Hypothesis Testing
Formally examine two opposing conjectures (hypotheses), $H_0$ and $H_A$.
These two hypotheses are mutually exclusive and exhaustive, so that one is true to the exclusion of the other.
We accumulate evidence (collect and analyze sample information) for the purpose of determining which of the two hypotheses is true and which is false.

Example
Consider a genome-wide association study (GWAS) for type 2 diabetes (T2D) in which you measure the blood glucose level of the case and control groups.
The null hypothesis, $H_0$: there is no difference between the case and control groups in the mean blood glucose levels.
  $H_0: \mu_1 - \mu_2 = 0$
The alternative hypothesis, $H_A$: the mean blood glucose levels in the case and control groups are different.
  $H_A: \mu_1 - \mu_2 \neq 0$

The Null and Alternative Hypothesis
The null hypothesis, $H_0$:
  States the (numerical) assumption to be tested
  Begin with the assumption that the null hypothesis is TRUE
  Always contains the = sign
The alternative hypothesis, $H_A$:
  Is the opposite of the null hypothesis
  Challenges the status quo
  Never contains just the = sign
  Is generally the hypothesis that is believed to be true by the researcher

One and Two Sided Tests
Hypothesis tests can be one or two sided (tailed).
One-tailed tests are directional:
  $H_0: \mu_1 - \mu_2 \leq 0$
  $H_A: \mu_1 - \mu_2 > 0$
Two-tailed tests are not directional:
  $H_0: \mu_1 - \mu_2 = 0$
  $H_A: \mu_1 - \mu_2 \neq 0$

One-sided Test P-values
Calculate a test statistic from the sample data that is relevant to the hypothesis being tested, e.g. in our GWAS example, the test statistic can be determined from the estimates of $\mu_1$, $\mu_2$ and $\sigma_1$, $\sigma_2$ computed from the GWAS data.
After calculating a test statistic we convert it to a P-value by comparing its value to the distribution of test statistics under the null hypothesis.
[Figure: null distribution with the observed test statistic value marked. The shaded area in one tail is the P-value: the probability of obtaining a test statistic value at least as extreme as the one we observed, under the null hypothesis.]

Two-sided Test P-values
Calculate a test statistic from the sample data that is relevant to the hypothesis being tested, e.g. in our GWAS example, the test statistic can be determined from the estimates of $\mu_1$, $\mu_2$ and $\sigma_1$, $\sigma_2$ computed from the GWAS data.
After calculating a test statistic we convert it to a P-value by comparing its value to the distribution of test statistics under the null hypothesis.
[Figure: null distribution with the observed test statistic value marked. The shaded area in both tails is the P-value: the probability of obtaining a test statistic value at least as extreme as the one we observed, in either direction, under the null hypothesis.]

When To Reject $H_0$
Level of significance, $\alpha$: specified before an experiment to define the rejection region.
Rejection region: the set of all test statistic values for which $H_0$ will be rejected.
[Figure: rejection regions for a one-sided test with $\alpha = 0.05$ and a two-sided test with $\alpha = 0.05$.]
The test statistic value required to reject is often called the critical value.
When we obtain a test statistic in the rejection region we can conclude that either the null is true and a highly improbable event has occurred, or the null is false.

Absolutely true, for sure, no doubt: p = 0.049
Nope! p = 0.051

Some Notation
In general, critical values for an $\alpha$ level test are denoted as:
  One-sided test: $X_{\alpha}$
  Two-sided test: $X_{\alpha/2}$
where $X$ indicates the distribution of the test statistic.
For example, if $X \sim N(0, 1)$:
  One-sided test: $z_{\alpha}$ (i.e., $z_{0.05} = 1.645$)
  Two-sided test: $z_{\alpha/2}$ (i.e., $z_{0.05/2} = z_{0.025} = \pm 1.96$)
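These critical values can be looked up with R's quantile functions. A minimal sketch, assuming $\alpha = 0.05$ and, for the t example, 4 degrees of freedom (both choices are illustrative):

```r
alpha <- 0.05

# Standard normal critical values.
z_one_sided <- qnorm(1 - alpha)      # reject if z exceeds this (upper tail)
z_two_sided <- qnorm(1 - alpha / 2)  # reject if |z| exceeds this

# The same idea for a t-distribution with 4 degrees of freedom.
t_two_sided <- qt(1 - alpha / 2, df = 4)

print(round(c(z_one_sided, z_two_sided, t_two_sided), 3))
```

The t critical value is noticeably larger than the normal one because the t-distribution has heavier tails at small degrees of freedom.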

Errors in Hypothesis Testing
Level of significance, $\alpha$: specified before an experiment to define the rejection region.
Rejection region: the set of all test statistic values for which $H_0$ will be rejected.
[Figure: one-sided rejection region, $\alpha = 0.05$.]
Given $\alpha = 0.05$, what is the chance we will falsely reject the null hypothesis?

Type I and II Errors

Actual situation (truth) | Decision: do not reject $H_0$ | Decision: reject $H_0$
$H_0$ true | Correct decision ($1 - \alpha$) | Incorrect decision: Type I error ($\alpha$)
$H_0$ false | Incorrect decision: Type II error ($\beta$) | Correct decision ($1 - \beta$)

$\alpha = P(\text{Type I error})$
$\beta = P(\text{Type II error})$
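One way to see that $\alpha$ is the Type I error rate is to simulate many experiments in which the null hypothesis really is true and count how often it gets rejected. The sketch below assumes normal data, a sample size of 10 and 5000 simulated experiments; all of these are illustrative choices, not values from the lecture.

```r
set.seed(560)  # fixed seed so the sketch is reproducible
alpha <- 0.05
n_sim <- 5000

# Each simulated experiment draws 10 values from N(0, 1) and tests H0: mu = 0,
# so H0 is true in every experiment.
p_values <- replicate(n_sim, t.test(rnorm(10, mean = 0, sd = 1), mu = 0)$p.value)

# Fraction of experiments that (falsely) reject H0; should be close to alpha.
type1_rate <- mean(p_values < alpha)
print(type1_rate)
```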

What is a p-value?
The probability of obtaining a test statistic at least as extreme as the one we observed, given that the null hypothesis is true and the assumptions we made are satisfied.
What is a p-value NOT?
  Not the probability that the null hypothesis is true
  Not the probability that the finding is a fluke
  Not the probability of falsely rejecting the null
  Does not indicate the size or importance of observed effects (confounded by sample size)

Central Dogma of Statistics
One sample: comparing a sample statistic to a known or assumed population parameter.
Two samples: comparing two sample statistics.

One sample hypothesis tests
Testing the probability of getting a result at least as extreme as the observed one from a sampling distribution.
[Figure: normal density curve, dnorm(seq(-5, 5, 0.01), sd = 1.5) plotted against seq(-5, 5, 0.01).]
If $n > 30$, the CLT says we can assume the sampling distribution is normal, in which case we can use a z-statistic.
If $n < 30$, we cannot assume the sampling distribution is normal, in which case we can use a t-statistic.

One sample hypothesis example
Let's say that we know a certain type of cell should have an average of 10 transcripts per cell. We measure 5 cells and find: [measurements not recovered]
[Figure: density curve, dnorm(seq(-5, 5, 0.01), sd = 1.5) plotted against seq(-5, 5, 0.01), with the center labeled $\mu = 10$.]

One sample hypothesis example
What is the p-value?
[Figure: density curve, dnorm(seq(-5, 5, 0.01), sd = 1.5) plotted against seq(-5, 5, 0.01), with the center labeled $\mu = 10$.]
Using pt(4.03, df = 4) in R, we find that 99.2% of t-values are below 4.03. Hence, p = 0.016 (two-tailed).
Interpretation? If the null hypothesis were true, we would expect to get sample means as extreme or more extreme than 12.5 only 1.6% of the time.
Reject the null hypothesis (at $\alpha = 0.05$).
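The arithmetic behind the two-tailed p-value can be spelled out in R. This sketch only uses the t-value (4.03) and degrees of freedom (4) quoted on the slide; the variable names are illustrative.

```r
t_obs <- 4.03
df <- 4

# Fraction of t-values below the observed one (about 0.992).
lower <- pt(t_obs, df = df)

# One-tailed p-value: area above the observed t (about 0.008).
p_one_tailed <- 1 - lower

# Two-tailed p-value: double it, since the t-distribution is symmetric
# (about 0.016).
p_two_tailed <- 2 * p_one_tailed

print(round(c(lower, p_one_tailed, p_two_tailed), 4))
```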

T-Statistic
Definition:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
where
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Interpretation: the number of (estimated) standard deviations of the sample mean from its expected value $\mu$.
The quantity $(n - 1)$ is called the degrees of freedom of the t value (we already learned that we lose a degree of freedom when we use the sample mean to calculate the variance).
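The definition can be checked against R's built-in `t.test`. The five values below are hypothetical (they are not the lecture's cell measurements), and `mu0` is the assumed population mean under the null.

```r
# Hypothetical sample (not the data from the lecture example).
x <- c(12, 9, 14, 11, 13)
mu0 <- 10  # population mean under the null hypothesis

n <- length(x)
x_bar <- sum(x) / n                          # sample mean
s <- sqrt(sum((x - x_bar)^2) / (n - 1))      # sample standard deviation
t_manual <- (x_bar - mu0) / (s / sqrt(n))    # t-statistic from the definition
df <- n - 1                                  # degrees of freedom

# The built-in one-sample t-test should give the same statistic.
t_builtin <- unname(t.test(x, mu = mu0)$statistic)
print(c(manual = t_manual, builtin = t_builtin, df = df))
```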

Student's t-distribution
The t-values follow the t-distribution:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
[Figure: density $P(t)$ against $t$ for different degrees of freedom (df).]

Student's t-distribution
W.S. Gosset (1876-1937) was a modest, well-liked Englishman who was a brewer and agricultural statistician for the famous Guinness brewing company in Dublin.
Guinness insisted that its employees keep their work secret, so he published the distribution in 1908 under the pseudonym "Student".
This was one of the first results in modern small-sample statistics.

Two Sample t-test
Paired two-sample t-test:
  There are two samples of the same size (say $n$ numbers)
  The corresponding numbers pair naturally
  Examples:
    Before-and-after pairs of measurements after giving a drug
    Expression levels of genes in paired samples (siblings, etc.)
Unpaired two-sample t-test:
  The two samples might even have different numbers of points (say $n_1$ and $n_2$, respectively)
  There is no natural pairing

Paired Two-Sample t-test
Given the following data (expression levels of genes):
  Before drug treatment: $x_1, x_2, \ldots, x_n$
  After drug treatment: $y_1, y_2, \ldots, y_n$
Measure whether the "after" member of the pair is different from the "before" member:
  Differences: $d_1, d_2, \ldots, d_n$, where $d_i = x_i - y_i$
Frame hypotheses:
  Null hypothesis $H_0$: the mean of this sample of differences is 0
  Alternative hypothesis $H_A$: the mean is not 0
Look familiar? Just a one-sample t-test!
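The equivalence between a paired t-test and a one-sample t-test on the differences can be confirmed in R. The before/after values below are made up for illustration.

```r
# Hypothetical expression levels for the same six genes before and after treatment.
before <- c(8.1, 7.4, 9.0, 6.8, 7.9, 8.5)
after  <- c(7.2, 7.0, 8.1, 6.9, 7.0, 7.7)

# Differences d_i = x_i - y_i.
d <- before - after

# Paired two-sample t-test ...
paired <- t.test(before, after, paired = TRUE)

# ... and a one-sample t-test of the differences against 0.
one_sample <- t.test(d, mu = 0)

print(c(paired = paired$p.value, one_sample = one_sample$p.value))
```

Both calls report the same t-statistic, degrees of freedom ($n - 1$) and p-value.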

Unpaired Two-Sample t-test
Suppose that two samples are drawn independently:
  $x_1, x_2, \ldots, x_n$ with sample mean $\bar{x}$; assume a normal sampling distribution with mean $\mu_x$
  $y_1, y_2, \ldots, y_m$ with sample mean $\bar{y}$; assume a normal sampling distribution with mean $\mu_y$
There is no connection between point 18 from one sample and point 18 from the other.
Are the samples drawn from distributions with the same mean or different means?
Frame hypotheses:
  Null hypothesis $H_0$: the two population means are equal, $\mu_x = \mu_y$
  Alternative hypothesis $H_A$: not equal, $\mu_x \neq \mu_y$

Theoretically
The distribution of the sample mean difference:
$$\bar{x} - \bar{y} \sim N\left(\mu_x - \mu_y,\ \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right)$$
Let's think about how the t-value should be defined here:
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}}$$
Under the null hypothesis, $\mu_x = \mu_y$.
As before, we usually have to use the sample variance because we don't know the true variance.

Un-Pooled Variances
Just replace the true variances with sample variances:
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}}$$
The t-statistic has the Student's t-distribution with degrees of freedom $v$.
It is complicated to figure out $v$ here! A good approximation is given as the harmonic mean:
$$v \approx \frac{2}{\frac{1}{n} + \frac{1}{m}}$$
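A sketch of the un-pooled statistic in R, on made-up data. Note one difference from the slide: R's `t.test` (with its default `var.equal = FALSE`) computes the degrees of freedom with the Welch-Satterthwaite formula rather than the harmonic-mean approximation, so both are computed here for comparison. The names `vx`, `vy`, `df_welch` and `df_harmonic` are illustrative.

```r
# Hypothetical independent samples of different sizes.
x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.2)
y <- c(4.2, 4.9, 4.4, 5.0, 4.1, 4.6, 4.8, 4.3)
n <- length(x)
m <- length(y)

# Per-sample variance of the mean: s_x^2 / n and s_y^2 / m.
vx <- var(x) / n
vy <- var(y) / m

# Un-pooled t-statistic.
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite degrees of freedom (what t.test uses by default).
df_welch <- (vx + vy)^2 / (vx^2 / (n - 1) + vy^2 / (m - 1))

# Harmonic-mean approximation from the slide, for comparison.
df_harmonic <- 2 / (1 / n + 1 / m)

welch <- t.test(x, y)  # var.equal = FALSE is the default
print(c(t = t_manual, df_welch = df_welch, df_harmonic = df_harmonic))
```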

Two-Sample t-test
Hypothesis testing:
  Null hypothesis $H_0$: no difference, $\mu_x = \mu_y$
  Alternative hypothesis $H_A$: different! $\mu_x \neq \mu_y$
t-statistic:
$$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}}$$
where
$$s_p^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{m} (y_i - \bar{y})^2}{(n - 1) + (m - 1)}$$
The degrees of freedom is $(n + m - 2)$.

T-tests gone wrong
Review of 30 studies that compared knee/hip replacements.
Some were bilateral, treating one leg and using the other as a control. Others compared treated and untreated individuals.
What type of t-test is appropriate in each case?

A Practical Summary
T-tests are for comparing one sample mean to a parametric (population) mean or comparing two sample means.
If you have multiple samples, stay tuned for ANOVA. Don't do a bunch of pairwise t-tests!
T-tests have two assumptions:
  The sampling distributions are assumed to be normal. Because of the CLT this is generally true for $n > 30$. For $n < 30$ the CLT doesn't apply, so the population itself should be normally distributed. You can check this assumption using the Kolmogorov-Smirnov test.
  The two samples are assumed to have the same variance. You can check this assumption using Bartlett's test. While important, the t-test is more robust to unequal variances than to weirdly distributed data.

R Session
Try out the t-test
Learn about Q-Q plots
Try out the Kolmogorov-Smirnov and Bartlett tests
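A possible starting point for the session, using only functions from base R's stats package. The data are simulated, so the sample sizes, means and standard deviations are assumptions made for illustration. One caveat: running the Kolmogorov-Smirnov test against a normal whose mean and standard deviation were estimated from the same sample makes its p-value only approximate.

```r
set.seed(560)  # fixed seed so the sketch is reproducible

# Simulated samples (hypothetical): two normal populations with equal variance.
x <- rnorm(20, mean = 10, sd = 2)
y <- rnorm(25, mean = 11, sd = 2)

# Q-Q plot coordinates: sample quantiles against theoretical normal quantiles.
# Drop plot.it = FALSE to draw the plot; points near a straight line suggest normality.
qq <- qqnorm(x, plot.it = FALSE)

# Kolmogorov-Smirnov test of x against a normal with the sample mean and sd.
ks <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))

# Bartlett's test for equal variances across the two samples.
bt <- bartlett.test(list(x, y))

# Two-sample t-test assuming equal variances (pooled).
tt <- t.test(x, y, var.equal = TRUE)

print(c(ks = ks$p.value, bartlett = bt$p.value, t_test = tt$p.value))
```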

Standard Error of the Mean
SEM is the standard deviation of the sampling distribution of the mean.
It is often confused with the standard deviation of a sample in the literature. The standard deviation is a descriptive statistic, but SEM describes the spread of the sampling distribution.
SEM is an estimate of how far the sample mean is likely to be from the population mean.
SD of the sample is the degree to which individuals within a sample differ from the sample mean.

Pooled Variances
If you assume that the variance is the same in both groups, you can pool all the data to estimate a common variance.
This maximizes your degrees of freedom (and thus your power).
The t-statistic is then defined as:
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_p^2}{n} + \frac{s_p^2}{m}}} = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}}$$

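Pooled Variance
Pooling variances:
$$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \qquad s_y^2 = \frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}$$
$$(n - 1)\, s_x^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad (m - 1)\, s_y^2 = \sum_{i=1}^{m} (y_i - \bar{y})^2$$
$$s_p^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{m} (y_i - \bar{y})^2}{n + m - 2} = \frac{(n - 1)\, s_x^2 + (m - 1)\, s_y^2}{n + m - 2}$$
The denominator, $n + m - 2$, is the degrees of freedom.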
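The pooled formulas can be checked against `t.test(..., var.equal = TRUE)`. The two samples below are hypothetical, and `sp2` is an illustrative name for $s_p^2$.

```r
# Hypothetical independent samples of different sizes.
x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.2)
y <- c(4.2, 4.9, 4.4, 5.0, 4.1, 4.6, 4.8, 4.3)
n <- length(x)
m <- length(y)

# Pooled variance: weighted average of the two sample variances.
sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)

# Pooled t-statistic and its degrees of freedom.
t_manual <- (mean(x) - mean(y)) / (sqrt(sp2) * sqrt(1 / n + 1 / m))
df <- n + m - 2

# Built-in equivalent: the equal-variance two-sample t-test.
pooled <- t.test(x, y, var.equal = TRUE)
print(c(manual = t_manual, builtin = unname(pooled$statistic), df = df))
```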

Standard Error of the Mean
$X_1, X_2, \ldots, X_n$ are independent observations from a population with mean $\mu$ and standard deviation $\sigma$.
[Equation steps not recovered; one step carries the annotation "is a property of RV" (random variables).]
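The equations on these slides did not survive, so the following is the standard textbook derivation of the standard error of the mean, offered as a plausible reconstruction rather than the slides' exact steps. It relies on two properties of random variables: $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$ for a constant $a$, and $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ for independent $X$ and $Y$.

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{n \sigma^2}{n^2} = \frac{\sigma^2}{n}$$

$$\mathrm{SEM} = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$

In practice $\sigma$ is unknown and is replaced by the sample standard deviation $s$, giving the estimate $s / \sqrt{n}$, which is the denominator of the one-sample t-statistic.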