HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă

OBJECTIVES
Significance level vs. p value
Parametric vs. non-parametric tests
Tests on means
1 Dec 14

SIGNIFICANCE LEVEL VS. p VALUE
Materials and Methods: significance level (α) = a property of the statistical procedure; it takes a fixed value, usually 0.05.
Results: p value = a random variable whose value depends on the composition of the individual sample.

SIGNIFICANCE LEVEL VS. p VALUE
http://www.ncbi.nlm.nih.gov/pmc/articles/pmc4129321/

PARAMETRIC & NON PARAMETRIC

                        Parametric            Non parametric
Assumed distribution    Normal                Any
Assumed variance        Homogeneous           Any
Type of data            Ratio or interval     Ordinal or nominal
Central measure         Mean                  Median
Dispersion measure      Standard deviation    (Q1; Q3)

                        Parametric            Non parametric
2 independent groups    Independent t test    Mann-Whitney test
2 dependent groups      Paired t test         Wilcoxon test
> 2 groups              ANOVA                 Kruskal-Wallis test, Friedman's ANOVA
Correlation             Pearson               Spearman, Kendall, etc.

HYPOTHESIS TESTING FOR A MEAN
Hypothesis testing via CI
One and two sided hypothesis tests

HYPOTHESIS
Null (H0): not significantly different (= symbol)
Alternative (HA / H1): significantly different
  two sided: ≠ symbol
  one sided: < symbol OR > symbol

HYPOTHESIS TESTING VIA CI
n = 33, m = 2.6, s = 1.2, SE = 0.21
The 95% CI for the average number of exams failed by medical students in the first year of study is 2.6 ± 1.96 × 0.21 = (2.19, 3.01). Based on this confidence interval, do these data support the hypothesis that medical students failed, on average, more than 2 exams?
H0: μ = 2 (medical students in the first year of study failed 2 exams, on average)
HA: μ > 2 (medical students in the first year of study failed more than 2 exams, on average)
Hypotheses are always about population parameters, never about sample statistics.
The null value μ = 2 lies below the interval (2.19, 3.01), so the data support HA.
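The interval can be recomputed from the summary statistics on this slide. The following is a minimal sketch, assuming a normal approximation for the sampling distribution of the mean and using only the Python standard library; the variable names are illustrative and not part of the original slides. With these inputs it gives approximately (2.19, 3.01).

```python
import math
from statistics import NormalDist

# Summary statistics from the slide
n, m, s = 33, 2.6, 1.2

se = s / math.sqrt(n)                    # standard error of the mean
z_star = NormalDist().inv_cdf(0.975)     # two-sided 95% critical value (about 1.96)

ci_low = m - z_star * se
ci_high = m + z_star * se
print(f"SE = {se:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")

mu_0 = 2                                 # null value from H0
print("null value inside CI:", ci_low <= mu_0 <= ci_high)
```

If the null value falls outside the interval, the data are not compatible with H0 at the corresponding two-sided significance level.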

HYPOTHESIS TESTING VIA CI
n = 33, m = 2.6, s = 1.2, SE = 0.21
P(observed or more extreme outcome | H0 true) = P(m > 2.6 | H0: μ = 2)
Test statistic: Z = (2.6 − 2)/0.21 = 2.8571
p value = P(Z > 2.8571) = 0.0021
p value = the probability of observing data at least as favorable to the alternative hypothesis as our current data set, if the null hypothesis were true.
If p value < 0.05, it would be very unlikely to observe the data if the null hypothesis were true, and hence we reject H0.
If p value > 0.05, it is likely to observe the data even if the null hypothesis were true, and hence we fail to reject H0.
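The one-sided p value above can be checked numerically. This is a small sketch using the standard normal distribution from the Python standard library; it takes the rounded SE = 0.21 from the slide, and the names `z`, `p_value` and `decision` are illustrative.

```python
from statistics import NormalDist

m, mu_0, se = 2.6, 2.0, 0.21         # sample mean, null value, rounded SE from the slide
alpha = 0.05

z = (m - mu_0) / se                  # test statistic
p_value = 1 - NormalDist().cdf(z)    # one-sided upper tail, since HA is mu > 2

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"Z = {z:.4f}, p value = {p_value:.4f}: {decision}")
```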

HYPOTHESIS TESTING ON A SINGLE MEAN
1. Hypotheses: H0: μ = null value & HA: μ ≠ null value (two sided; < or > for a one sided test)
2. Calculate the point estimate (the sample mean m)
3. Check conditions:
   Independence: observations are independent of each other
   Sample size: n ≥ 30
4. Draw the sampling distribution, shade the p value, calculate the test statistic: Z = (m − μ)/(s/√n)
5. Make a decision:
   p value < α: reject H0 (data provide convincing evidence for HA)
   p value > α: fail to reject H0 (data do not provide convincing evidence for HA)

INDEPENDENT SAMPLES: ARE TWO MEANS THE SAME?
Total sample size: large (n > 50 or n > 100) or σ's known
  Subgroup sample sizes (approximately equal or very different):
    Equal variances: Z tests
    Unequal variances: rank sum test
Total sample size: small
  Subgroup sample sizes approximately equal: t test
  Subgroup sample sizes very different: rank sum test
Assumptions:
  the observations are independent of each other;
  the samples are drawn from a normal distribution (use a rank test when this assumption is violated);
  the standard deviations of the samples are not statistically different from each other (otherwise apply an unequal-variance form of the means test or a rank test).
16 Dec 2013

INDEPENDENT SAMPLES T TEST

Z AND T TESTS TO COMPARE A SAMPLE MEAN WITH A POPULATION MEAN
Z test
  When? Population standard deviation is known OR n > 50 (100)
  Hypotheses: m = μ (H0) vs. m ≠ μ (H1)
  Significance level (α = 0.05): critical value from the standard normal distribution
  Test statistic: z = (m − μ)/(σ/√n), where σ = population standard deviation, n = sample size
t test
  When? Unknown population standard deviation OR n < 50 (100)
  Hypotheses: m = μ (H0) vs. m ≠ μ (H1)
  Significance level (α = 0.05): critical value with n − 1 df (degrees of freedom)
  Test statistic: t = (m − μ)/(s/√n), where s = sample standard deviation, n = sample size

Z AND T TESTS TO COMPARE A SAMPLE MEAN WITH A POPULATION MEAN
PROBLEM 1
Z test: μ = 0 (H0) vs. μ ≠ 0 (H1); α = 0.05; σ = 1.75; n = 15; m = 3.87; Z crit = 1.96. Z = ? Conclusion?
t test: μ = 0 (H0) vs. μ ≠ 0 (H1); α = 0.05; s = 2.50; n = 15; m = 3.87; t crit = 2.145. t = ? Conclusion?
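One way to work through Problem 1 is to evaluate both statistics and compare each with the critical value given on the slide. The sketch below does only that; it uses the slide's numbers, and the variable names are illustrative.

```python
import math

n, m, mu_0 = 15, 3.87, 0.0               # sample size, sample mean, null value

# Z test: population SD sigma is treated as known
sigma, z_crit = 1.75, 1.96
z = (m - mu_0) / (sigma / math.sqrt(n))

# t test: sample SD s, critical value for df = n - 1 = 14
s, t_crit = 2.50, 2.145
t = (m - mu_0) / (s / math.sqrt(n))

for name, stat, crit in [("Z", z, z_crit), ("t", t, t_crit)]:
    verdict = "reject H0" if abs(stat) > crit else "fail to reject H0"
    print(f"{name} = {stat:.2f} (critical value {crit}): {verdict}")
```

Under these inputs both statistics exceed their critical values, so H0 is rejected in both cases.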

Z AND T TESTS TO COMPARE TWO SAMPLE MEANS
Z test
  When? Population standard deviation is known OR n > 50 (100)
  Hypotheses: μ1 = μ2 (H0) vs. μ1 ≠ μ2 (H1)
  Significance level (α = 0.05): critical value from the standard normal distribution
  Test statistic: z = (m1 − m2)/σd, where σd = population standard error, σd = σ × sqrt(1/n1 + 1/n2)
t test
  When? Unknown population standard deviation OR n < 50 (100)
  Hypotheses: μ1 = μ2 (H0) vs. μ1 ≠ μ2 (H1)
  Significance level (α = 0.05): critical value with n1 + n2 − 2 df (degrees of freedom)
  Test statistic: t = (m1 − m2)/sd, where sd = standard error

T TESTS TO COMPARE TWO SAMPLE MEANS
Age and prostate cancer: t test
Negative biopsy: n1 = 206, m1 = 66.59 years old, s1 = 8.21
Positive biopsy: n2 = 95, m2 = 67.14 years old, s2 = 7.88
σ = 8.10 (n = 301); α = 0.05; t critical = 1.96
sd = sqrt((1/206 + 1/95) × ((205 × 8.21² + 94 × 7.88²)/(206 + 95 − 2))) = 1.0055
t = (m1 − m2)/sd = (66.59 − 67.14)/1.0055 = −0.5470 (p-value = 0.582)
−1.96 < −0.5470 < 1.96, so we fail to reject H0 (the mean age of subjects with positive biopsy is not significantly different from the mean age of subjects with negative biopsy).
For samples > 100 the difference between the Z and t statistics is negligible and the p-values are practically identical.
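The pooled standard error and the t statistic for this example can be reproduced from the summary statistics. This is a sketch assuming equal population variances (the pooled form of the test); the variable names are illustrative, and the critical value 1.96 is the one quoted on the slide.

```python
import math

n1, m1, s1 = 206, 66.59, 8.21    # negative biopsy
n2, m2, s2 = 95, 67.14, 7.88     # positive biopsy

df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
sd = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of m1 - m2
t = (m1 - m2) / sd

t_crit = 1.96                    # critical value quoted on the slide
verdict = "reject H0" if abs(t) > t_crit else "fail to reject H0"
print(f"df = {df}, sd = {sd:.4f}, t = {t:.4f}: {verdict}")
```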

STUDENT T-TEST FOR COMPARING TWO MEANS (UNKNOWN AND EQUAL VARIANCES)
Null hypothesis: The difference between the means of the two populations is not significantly different from zero.
Alternative hypothesis for two tailed test: The difference between the means of the two populations is significantly different from zero.
Assumptions:
  The variables in the two samples are normally distributed
  The variances are equal
If these two assumptions are not satisfied, the test loses its validity.
If the variances of the populations are known, the Z test is applied (it is more powerful).

STUDENT T-TEST FOR COMPARING TWO MEANS (UNKNOWN AND EQUAL VARIANCES)
Degrees of freedom (df): df = n1 + n2 − 2
Significance level: α = 0.05
Critical region for two tailed test:
$$\left(-\infty;\; -t_{n_1+n_2-2;\,\alpha/2}\right] \cup \left[t_{n_1+n_2-2;\,\alpha/2};\; +\infty\right)$$
Statistic:
$$t = \frac{m_1 - m_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

STUDENT T-TEST FOR COMPARING TWO MEANS (UNKNOWN AND EQUAL VARIANCES)
We want to study whether there is a significant difference in the amount of blood uric acid between women from urban and rural areas. In a sample of 16 women aged between 30 and 50 years from urban areas, the average uric acid was 5 mg/100 ml, with a variance of 2 (mg/100 ml)². In a sample of 16 women aged 30 to 50 years from rural areas, the average was 4 mg/100 ml, with a variance of 2 (mg/100 ml)².

STUDENT T-TEST FOR COMPARING TWO MEANS (UNKNOWN AND EQUAL VARIANCES)
Data: n1 = 16; n2 = 16; m1 = 5; m2 = 4; s² = 2
Null hypothesis: There is no significant difference between the means of the two samples.
Alternative hypothesis for two sided test: There is a significant difference between the means of the two samples.
Degrees of freedom: df = n1 + n2 − 2 = 16 + 16 − 2 = 30
Significance level: α = 0.05
Critical region for bilateral test:
$$\left(-\infty;\; -t_{n_1+n_2-2;\,0.025}\right] \cup \left[t_{n_1+n_2-2;\,0.025};\; +\infty\right) = (-\infty;\; -2.04] \cup [2.04;\; +\infty)$$

STUDENT T-TEST FOR COMPARING TWO MEANS (UNKNOWN AND EQUAL VARIANCES)
$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(16 - 1)\cdot 2 + (16 - 1)\cdot 2}{16 + 16 - 2}} = \sqrt{\frac{60}{30}} = \sqrt{2} = 1.41$$
$$t = \frac{m_1 - m_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{5 - 4}{1.41\sqrt{\frac{1}{16} + \frac{1}{16}}} = \frac{1}{1.41 \times 0.354} = \frac{1}{0.50} = 2.00$$
Conclusion:
Statistical: We fail to reject the null hypothesis since the test statistic (2.00) does not fall in the critical region (|t| ≥ 2.04).
Clinical: The serum level of uric acid is not significantly different in women from rural areas compared to those from urban areas.
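The uric acid example can be evaluated directly from the pooled-variance formulas of the preceding slides. This is a sketch with illustrative variable names; the inputs are the sample sizes, means and variances given in the example, and the critical value 2.04 is the one stated for df = 30. Evaluated this way, the statistic comes out at about 2.00.

```python
import math

n1, n2 = 16, 16          # urban, rural sample sizes
m1, m2 = 5.0, 4.0        # mean uric acid, mg/100 ml
var1, var2 = 2.0, 2.0    # sample variances

df = n1 + n2 - 2
s = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)   # pooled SD
t = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))

t_crit = 2.04            # two-sided critical value for df = 30, alpha = 0.05
verdict = "reject H0" if abs(t) >= t_crit else "fail to reject H0"
print(f"s = {s:.2f}, t = {t:.2f}, df = {df}: {verdict}")
```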

http://www.sciencedirect.com/science/article/pii/s0950061810005568#

PAIRED SAMPLES STUDENT T-TEST FOR COMPARING MEANS
Aim: comparing the means of two paired samples on a quantitative continuous variable (paired means the same quantitative variable is observed before and after the action of a factor)
Assumptions:
  Each individual observation in the first sample corresponds to a pair in the second sample.
  The differences between pairs of values are normally distributed.
Null hypothesis: The mean of the differences of the paired data is not significantly different from zero.
Alternative hypothesis for two tailed test: The mean of the differences of the paired data is significantly different from zero.

STUDENT (T) FOR COMPARING MEANS OF PAIRED SAMPLES
Degrees of freedom (df): df = n − 1
Significance level: α = 0.05
Critical region:
$$\left(-\infty;\; -t_{n-1;\,\alpha/2}\right] \cup \left[t_{n-1;\,\alpha/2};\; +\infty\right)$$
Statistic:
$$t = \frac{\bar{d}}{s/\sqrt{n}}, \qquad \bar{d} = \frac{d_1 + d_2 + \dots + d_n}{n}$$
s = standard deviation of the differences
n = sample size

PAIRED STUDENT T-TEST

STUDENT T-TEST FOR COMPARING MEANS OF PAIRED SAMPLES
Null hypothesis: There is no significant difference in systolic blood pressure before and after the use of oral contraceptives.
Alternative hypothesis for two tailed test: There is a significant difference in systolic blood pressure before and after the use of oral contraceptives.
Degrees of freedom: df = n − 1 = 10 − 1 = 9

STUDENT T-TEST FOR COMPARING MEANS OF PAIRED SAMPLES
$$\bar{d} = \frac{13 + 3 - 1 + 9 + 7 + 7 + 6 + 4 - 2 + 2}{10} = \frac{48}{10} = 4.8$$
$$s = \sqrt{\frac{67.24 + 3.24 + 33.64 + (4.2)^2 + 2 \cdot 4.84 + 1.44 + 0.64 + 46.24 + 7.84}{10 - 1}} = \sqrt{\frac{187.60}{9}} = \sqrt{20.84} = 4.57$$
$$t = \frac{\bar{d}}{s/\sqrt{n}} = \frac{4.8}{4.57/\sqrt{10}} = \frac{4.8}{1.445} = 3.32$$
Conclusion (two sided test):
Statistical: The null hypothesis is rejected since the test statistic falls in the critical region.
Clinical: The use of oral contraceptives is associated with a significant increase in systolic blood pressure.
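The paired test can be sketched from the ten differences that appear in the numerator of the mean above. Two assumptions are made here and are not stated on the slide: the critical value 2.262 is taken from a standard t table for df = 9 and α = 0.05 (two sided), and the standard error uses √n with n = 10, following the formula t = d̄/(s/√n) given earlier. Variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Paired differences in systolic blood pressure, read from the slide
d = [13, 3, -1, 9, 7, 7, 6, 4, -2, 2]

n = len(d)
d_bar = mean(d)                      # mean difference
s_d = stdev(d)                       # sample SD of the differences (n - 1 denominator)
t = d_bar / (s_d / math.sqrt(n))     # paired t statistic, df = n - 1

t_crit = 2.262                       # assumption: t table value for df = 9, alpha = 0.05
verdict = "reject H0" if abs(t) >= t_crit else "fail to reject H0"
print(f"d_bar = {d_bar:.1f}, s = {s_d:.2f}, t = {t:.2f}, df = {n - 1}: {verdict}")
```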

TESTS ON MEANS BY EXAMPLES
http://www.hindawi.com/journals/tswj/2013/608683/tab2/

Thank you for your attention!