Introduction to hypothesis testing
- Mark Cobb
- 5 years ago
Transcription
Review: Logic of Hypothesis Tests
- Usually, we test (attempt to falsify) a null hypothesis (H_0), which includes all possibilities except the prediction in the hypothesis (H_A).
- If the hypothesis (H_A) is that an experimental treatment has an effect, the null hypothesis is that there is no effect.
- Disproving H_0 is evidence that the actual hypothesis is true.
Decision criterion
- How low a probability should make us reject H_0?
- If the probability is less than the significance level (critical p-value, α), reject H_0; otherwise do not reject.
- Convention sets the significance level at α = 0.05 (5%).
- This is arbitrary: other significance levels might be valid, depending on context.

Three special types of hypothesis tests based on the t distribution
1. The mean of a distribution is different from a constant (one sample t test).
2. The mean difference in pairs of observations is different from a constant (paired t test).
3. Two distributions differ, i.e. the means from two sets of observations do not come from the same distribution of means (two sample t test).
t statistic
General form of the t statistic:
$$t = \frac{S_t - \theta}{SE}$$
where S_t is the sample statistic, θ is the parameter value specified in H_0 and SE is the standard error of the sample statistic.
Specific form for a population mean:
$$t = \frac{\bar{y} - \mu}{s / \sqrt{n}}$$
where μ is the value of the mean specified in H_0.

Test statistics
- Sampling distributions of t, one for each sample size, when H_0 is true; use degrees of freedom (df = n - 1).
- The area under each sampling (probability) distribution equals one.
- They give the probabilities of obtaining particular ranges of t when H_0 is true.
Simple null hypothesis
- Test of the hypothesis that the population mean equals a particular value (H_0: μ = θ).
- These values may come from the literature, other research or legislation.
One sample t-test
[Figure: Mean(B_To_D) by Group (Europe, Islamic, NewWorld); Ourworld data]
Populations are fairly stable if the ratio of births to deaths is close to 1.25.
H_0: B/D ratio = 1.25
H_A: B/D ratio ≠ 1.25
1) Are the B/D ratios for any of these groups = 1.25?
2) Test using a one sample t-test.
One sample t-tests
Single population: H_0: μ = 0 (or any other pre-specified value: here 1.25)
$$t = \frac{\bar{y} - 1.25}{s_{\bar{y}}} = \frac{\bar{y} - 1.25}{s / \sqrt{n}}, \qquad df = n - 1$$

Results: Europe
[Figure panels: 1. Box plot 2. Normal approximation 3. Histogram; axis: Probability]
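The one sample calculation can be sketched in a few lines of Python. This is only an illustration: the function name one_sample_t is made up here, and the ratios below are invented values, not the Ourworld data used in the slides.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (t, df) for a one sample t test of H0: population mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)      # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)             # standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical birth/death ratios, invented for illustration only.
ratios = [1.10, 1.32, 1.05, 1.21, 1.18, 1.27]
t, df = one_sample_t(ratios, 1.25)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t would then be compared with the t distribution for df = n - 1, as described above.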
More results
[JMP Test Mean output for Islamic: Hypothesized Value, Actual Estimate, DF, Std Dev, Test Statistic, Prob > |t|, Prob > t, Prob < t; two of the probabilities are <.0001*]
[JMP Test Mean output for New World: Hypothesized Value 1.25, Actual Estimate, DF 20, Std Dev, Test Statistic, Prob > |t|, Prob > t, Prob < t; two of the probabilities are <.0001*]

Even more: a way to present the results
[Figure: Births / deaths (95% CI) by group, with H_0 marked]
Two sample t-test
- Used to compare two populations, each of which has been sampled.
- The simplest form of tests among multiple populations.
- Example: does the average annual income differ for males and females? H_0: income (males) = income (females)
[Figure: income by SEX (Female, Male); Survey2 data]

Calculation: H_0: μ_1 = μ_2, i.e. μ_1 - μ_2 = 0; independent observations
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_{\bar{y}_1 - \bar{y}_2}} = \frac{\bar{y}_1 - \bar{y}_2}{s_{\bar{y}_1 - \bar{y}_2}} = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where s_p is the pooled standard deviation (more later), and df = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2.
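A minimal Python sketch of the pooled calculation follows. The function name pooled_two_sample_t and the income figures are assumptions made for illustration; they are not the Survey2 data.

```python
import math

def pooled_two_sample_t(y1, y2):
    """Return (t, df) for H0: mu1 == mu2, using the pooled standard deviation."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    ss1 = sum((y - m1) ** 2 for y in y1)        # sum of squares, group 1
    ss2 = sum((y - m2) ** 2 for y in y2)        # sum of squares, group 2
    df = n1 + n2 - 2
    s_p = math.sqrt((ss1 + ss2) / df)           # pooled standard deviation
    se = s_p * math.sqrt(1 / n1 + 1 / n2)       # standard error of the difference
    return (m1 - m2) / se, df

# Hypothetical annual incomes (thousands), invented for illustration only.
females = [21, 34, 28, 40, 25, 31]
males = [30, 45, 38, 52, 29, 41]
t, df = pooled_two_sample_t(females, males)
print(f"t = {t:.3f}, df = {df}")
```

The sign of t only reflects which group is listed first; the magnitude and df are what get compared with the t distribution.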
Logic of the two sample t test
Assume H_0: μ_1 = μ_2 and H_A: μ_1 > μ_2
1) If H_0 is true then the null distribution is known (for a set df).
2) If H_A is true, we don't know the distribution, but we do know that it is not the null distribution.
[Figure: probability of t when H_0 is true (central t) and when H_A is true (non-central t)]
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Assume H_0: μ_1 = μ_2 is true, 4 df
Critical value: t_{0.05, 4 df} ≈ 2.13
Any t > 2.13 will lead to incorrect rejection of H_0.
1. This means that the difference between ȳ_1 and ȳ_2 is greater than 2.13 pooled standard errors.
2. This will happen 5% of the time.
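The "5% of the time" claim can be checked by simulation. The sketch below assumes two groups of 3 observations (so df = 4) drawn from the same normal population, and uses the tabled one-tailed critical value of 2.132; the function names, seed and number of simulations are arbitrary choices made here.

```python
import math
import random

def pooled_t(y1, y2):
    """Pooled two sample t statistic for two lists of observations."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    ss = sum((y - m1) ** 2 for y in y1) + sum((y - m2) ** 2 for y in y2)
    s_p = math.sqrt(ss / (n1 + n2 - 2))
    return (m1 - m2) / (s_p * math.sqrt(1 / n1 + 1 / n2))

def type_one_error_rate(n_sims=20000, n=3, t_crit=2.132, seed=1):
    """Fraction of simulated experiments with t > t_crit when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Both samples come from the same population, so H0 is true.
        y1 = [rng.gauss(0, 1) for _ in range(n)]
        y2 = [rng.gauss(0, 1) for _ in range(n)]
        if pooled_t(y1, y2) > t_crit:
            rejections += 1
    return rejections / n_sims

rate = type_one_error_rate()
print(f"Proportion of incorrect rejections: {rate:.3f}")
```

The proportion should come out close to 0.05, which is the sense in which the significance level is the rate of incorrect rejections when H_0 is true.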
Assume H_A: μ_1 > μ_2 is true, 4 df
Critical value: t_{0.05, 4 df} ≈ 2.13
Any t < 2.13 will lead to an incorrect failure to reject H_0 (that is, incorrect rejection of H_A).
1. This means that the difference between ȳ_1 and ȳ_2 is less than 2.13 pooled standard errors.
2. The probability that this will happen depends on n and on the true difference between μ_1 and μ_2.

Results of example
What is the conclusion?
[Output: Difference in Means, reported for the pooled and the unequal variance t-tests]
The unequal variance t-test is based on the Satterthwaite adjustment (of degrees of freedom). It is not recommended unless the variance terms are very different and the sample sizes (n) are very different.
[Figure: Annual Income (mean ± SE) by SEX (Female, Male)]

Paired t tests: the logic
1. Often there is interest in comparisons of observations that can be considered paired within a subject or replicate.
   a) For example:
      i. A comparison of activity level before and after eating in the same individual
      ii. A comparison of longevity of males vs females, where county is the replicate
2. In such cases there is often benefit in accounting for variance that could be caused by differences among subjects (or replicates).
Paired observations: paired t-test
H_0: μ_d = 0, where μ_d is the mean difference between paired observations
$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n_d}}$$
where s_d is the standard deviation of the sample of differences, and df = n - 1 where n is the number of pairs.

Paired t-test example II
- Pisaster comes in two colors along the west coast: purple and orange.
- H_0: density of purple per site = density of orange
- Individual reefs are the replicates of interest.
- Looks like a no brainer.
[Figure: Density by COLOR (Orange, Purple); sea star colors, all sites, two sample]
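As a rough sketch, the paired statistic is a one sample t test on the within-pair differences. The function name paired_t and the densities below are invented for illustration; they are not the site values from the slides.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for H0: mean within-pair difference == 0."""
    diffs = [a - b for a, b in zip(first, second)]   # one difference per pair
    n_d = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)                    # SD of the differences
    return d_bar / (s_d / math.sqrt(n_d)), n_d - 1

# Hypothetical sea star densities at six sites (same site order in both lists).
purple = [120, 340, 95, 410, 260, 180]
orange = [40, 110, 35, 150, 70, 60]
t, df = paired_t(purple, orange)
print(f"t = {t:.3f}, df = {df}")
```

Because only the differences enter the calculation, variation among sites in overall density does not inflate the standard error.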
Results of a 2 sample test
[Output table: GROUP (Orange, Purple) with N, Mean and Standard Deviation; pooled variance difference in means, confidence interval, t, df, p-value]
Marginally significant. WHY?
[Figure: Density (95% CI) and counts by color of sea stars (Orange, Purple)]

- Consider the variability added at the level of replicate (site).
- Given that observations are paired at the level of site, can this be accounted for?
[Figure: Density by COLOR (Orange, Purple), and Density by SITE (Govpt, Boat Stair, Shell Beach, Hazards, Cayucos, PSN) for each COLOR]
Paired test: details of calculation
[Table: Site (Govpt, Boat Stair, PSN, Cayucos, Hazards, Shell Beach) with columns Purple, Orange, difference; summary rows mean, Sediff, t]
[Figure: ORANGE and PURPLE by Index of Case]
Note the slopes: are they the same? Perhaps rates are a better comparison:
1) Convert to rates, or
2) Log transform

Paired test: details of calculation, using log transformed data
[Table: Site (Govpt, Boat Stair, PSN, Cayucos, Hazards, Shell Beach) with columns Purple(log), Orange(log), difference; summary rows mean, Sediff, t]
[Figure: LORANGE and LPURPLE by Index of Case]
Note the slopes are much more similar. This indicates that purples are more common by a constant ratio rather than by a constant amount.
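The point about constant ratios can be illustrated with a small Python sketch. The densities are invented so that purple is exactly three times orange at every site; they are not the values from the slides.

```python
import math

# Hypothetical densities per site, constructed so that purple = 3 x orange.
orange = [10, 40, 120, 300]
purple = [3 * x for x in orange]

# Raw differences grow with overall site density...
raw_diffs = [p - o for p, o in zip(purple, orange)]

# ...but differences of logs are the log of the ratio, so they stay constant.
log_diffs = [math.log(p) - math.log(o) for p, o in zip(purple, orange)]

print("raw differences:", raw_diffs)
print("log differences:", [round(d, 4) for d in log_diffs])
```

With a constant ratio, the raw differences have a large standard deviation while the log differences have almost none, which is why a paired test on log transformed data can be far more sensitive in this situation.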
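Review: calculations of t
One sample test:
$$t = \frac{\bar{y} - \mu}{s / \sqrt{n}}$$
Two sample test:
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Paired test:
$$t = \frac{\bar{d}}{s_d / \sqrt{n_d}}$$

Calculations of standard error
1) One sample t-test:
$$SE = \frac{s}{\sqrt{n}}, \qquad s^2 = \frac{SS}{n - 1}$$
2) Paired t-test:
$$SE = \frac{s_d}{\sqrt{n_d}}, \qquad s_d^2 = \frac{SS_d}{n_d - 1}$$
3) Two sample t-test (calculation based on pooled variance term):
$$SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p^2 = \frac{SS_1 + SS_2}{(n_1 - 1) + (n_2 - 1)} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$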
Testing statistical null hypotheses
Hypothesis construction
General hypothesis
A hypothesis that addresses the general question of interest.
H_0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces

Specific hypotheses
A hypothesis that represents the specific question addressed in your study. The specifics include:
- Location of study
- Time period
- Replication
- Simple description of design
Specific hypothesis
H_0: There will be no difference in the density of (species name) on vertical vs horizontal surfaces, based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
H_A: There will be a difference in the density of (species name) on vertical vs horizontal surfaces, based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
Note: much of this can be placed in the methods section, which would remove the need to state these details in the hypotheses. However, the hypotheses above are what is actually being tested.

Depiction of hypotheses
H_0: There will be no difference in the density of (species name) on vertical vs horizontal surfaces, based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
[Figure: axis of Horizontal Density - Vertical Density of Urchins, with increasing likelihood that H_0 is incorrect toward either end]
Depiction of hypotheses: what should the units be?
[Figure: H_0 on the axis of Horizontal Density - Vertical Density of Urchins, with increasing likelihood that H_0 is incorrect toward either end]

Goal
- To use the same units for all assessments irrespective of species or system
- To have the same set of probabilities based on those units
- Hence, units should link to an estimate of confidence
- The most common form is t-values, which provide an estimate of the difference in mean values calibrated by an estimate of error in the assessment of the mean values
T-statistic
$$T = \frac{\bar{X}_1 - \bar{X}_2}{SE}$$

SE and SD
$$SE = \frac{SD}{\sqrt{N}} \quad \text{(standard error)}$$
$$SD = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} \quad \text{(standard deviation)}$$
where N is the number of replicates.

Depiction of hypotheses: what should the units be?
[Figure: H_0 on the axis T = (Horizontal Density - Vertical Density of Urchins) / SE, with increasing likelihood that H_0 is incorrect toward either end]
Depiction of hypotheses: what should the units be?
[Figure: H_0 on the axis T = (Horizontal Density - Vertical Density of Urchins) / SE, with increasing likelihood that H_0 is incorrect toward either end]
- The t distribution (central t) is a null probability distribution.
- It depicts the probability of obtaining each value of T when the null hypothesis is correct.
- One use is to estimate confidence levels.
Depiction of hypotheses: what should the units be?
[Figure, repeated over two slides: H_0 on the axis T = (Horizontal Density - Vertical Density of Urchins) / SE, with increasing likelihood that H_0 is incorrect toward either end]
H_0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
[Figure, repeated over two slides: null distribution on the axis T = (Horizontal Density - Vertical Density of Urchins) / SE]
H_0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
Including error yields a confidence interval, e.g. 95% confident that the true t value is between the two limits of the 95% CI.
[Figure: 95% CI on the axis T = (Horizontal Density - Vertical Density of Urchins) / SE]

H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: 100% CI spanning the axis; 95% CI in the center with 2.5% in each tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]
The importance of directionality of the alternative hypothesis (H_A)
Consider:
H_0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces
vs
H_01: Urchin density on horizontal surfaces will be greater than or equal to that on vertical surfaces
H_A1: Urchins will be more dense on vertical than on horizontal surfaces

H_01: Urchin density on horizontal surfaces will be greater than or equal to that on vertical surfaces
[Figure: 100% CI spanning the axis; 95% CI with a single 5% tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]
H_A1: Urchins will be more dense on vertical than on horizontal surfaces
[Figure: 100% CI spanning the axis; 95% CI with a single 5% tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]

One vs two tailed hypotheses
1. Which is more interesting?
2. Which is more informed?
H_A1: Urchins will be more dense on vertical than on horizontal surfaces
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: for H_A1, 95% CI with a single 5% tail; for H_A, 95% CI with 2.5% in each tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]
One vs two tailed hypotheses
1. Which is more powerful?
H_A1: Urchins will be more dense on vertical than on horizontal surfaces
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: for H_A1, 95% CI with a single 5% tail; for H_A, 95% CI with 2.5% in each tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]

Example
Replication on horizontal and vertical surfaces = 50 (100 total)
Mean on horizontal surfaces =
Mean on vertical surfaces =
Pooled standard deviation =
$$T = \frac{\bar{X}_h - \bar{X}_v}{SE}$$
One vs two tailed hypotheses
1. Which is more powerful?
H_A1: Urchins will be more dense on vertical than on horizontal surfaces. T = -1.79, p = 0.04
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces. T = -1.79, p = 0.08
[Figure: for H_A1, 95% CI with a single 5% tail; for H_A, 95% CI with 2.5% in each tail; axis T = (Horizontal Density - Vertical Density of Urchins) / SE]

One vs two tailed hypotheses: conversion to original units
H_A1: Urchins will be more dense on vertical than on horizontal surfaces. Difference = , p = 0.04
H_A: There will be a difference in the density of urchins on vertical vs horizontal surfaces. Difference = , p = 0.08
[Figure: the same intervals on the axis Horizontal Density - Vertical Density of Urchins]
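The two p-values for the same T can be reproduced approximately with the standard library alone. The sketch below assumes df = 50 + 50 - 2 = 98 (inferred from the stated replication, not given in the slides) and integrates the t density numerically with Simpson's rule; the function names are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= |t|): 0.5 minus the area from 0 to |t| (Simpson's rule, even steps)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_obs, df = -1.79, 98             # df is an assumption: 50 + 50 - 2
p_one = upper_tail(t_obs, df)     # one tail, in the predicted direction
p_two = 2 * p_one                 # both tails
print(f"one tailed p = {p_one:.3f}, two tailed p = {p_two:.3f}")
```

The two tailed probability is exactly twice the one tailed probability, which is why the same T can fall below 0.05 under the directional hypothesis but not under the non-directional one.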
This is the difference between 1 and 2 tailed hypotheses: make sure you know which you are dealing with.
- Always strive for one tailed hypotheses.
- Is there a directional prediction (e.g. > or, separately, <)? If so: one tailed. If not: two tailed.

Assumptions of t test
- The t test is a parametric test.
- The t statistic only follows the t distribution if:
  - the variable has a normal distribution (normality assumption)
  - the two groups have equal population variances (homogeneity of variance assumption)
  - observations are independent or specifically paired (independence assumption)
Normality assumption
Data in each group are normally distributed.
Checks:
- Frequency distributions (be careful)
- Boxplots
- Probability plots
- Formal tests for normality
Solutions:
- Transformations
- Don't worry, run it anyway (just kidding, but not entirely)

Homogeneity of variance
Population variances are equal in the 2 groups.
Checks:
- Subjective comparison of sample variances
- Boxplots
- F-ratio test of H_0: σ_1^2 = σ_2^2
Solutions:
- Transformations
- Don't worry, run it anyway (just kidding again, but again not entirely)
F-test on variances
- H_0: σ_1^2 = σ_2^2
- F statistic (F-ratio) = ratio of the 2 sample variances: F = s_1^2 / s_2^2
- Reject H_0 if F is sufficiently less than or greater than 1
- If H_0 is true, the F-ratio follows an F distribution
- Usual logic of statistical test

Boxplot
[Figure: boxplot of LENGTH with labels: median, 25% of values (twice), smallest value, largest value]
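Computing the F-ratio itself takes one line; the sketch below uses invented measurements and a made-up function name, and stops short of looking up the F distribution.

```python
import statistics

def f_ratio(sample_1, sample_2):
    """Ratio of the two sample variances, F = s1^2 / s2^2."""
    return statistics.variance(sample_1) / statistics.variance(sample_2)

# Hypothetical measurements, invented for illustration only.
group_1 = [4.1, 5.0, 5.6, 6.2, 7.3]
group_2 = [4.9, 5.1, 5.2, 5.4, 5.6]
print(f"F = {f_ratio(group_1, group_2):.2f}")
```

The ratio would be compared with an F distribution on (n_1 - 1, n_2 - 1) degrees of freedom; values far from 1 in either direction count against equal variances.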
[Figure: Count vs limpet numbers per quadrat, four panels: 1. IDEAL 2. SKEWED 3. OUTLIERS 4. UNEQUAL VARIANCES]
Use of transformations to control departures from the normality and homogeneity of variances assumptions
[Table: Variance of Pop_1990 and Lpop1990 for Europe, Islamic and Newworld, with the greatest ratio of variances]
[Figure: population and log population by GROUP (Europe, Islamic, NewWorld); Ourworld data]

Nonparametric tests
- Usually based on ranks of the data
- H_0: samples come from populations with identical distributions (equal means or medians)
- Don't assume a particular underlying distribution of data: normal distributions not necessary
- Equal variances and independence still required
- Typically much less powerful than parametric tests
Mann-Whitney-Wilcoxon test
- Calculates the sum of ranks in the 2 samples: these should be similar if H_0 is true
- Compares the rank sum to the sampling distribution of rank sums (the distribution of rank sums when H_0 is true)
- Equivalent to a t test on data transformed to ranks

Additional slides
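The rank sum step can be sketched as follows. The function name rank_sums is made up here, tied values are given the average of the ranks they occupy (a common convention, assumed rather than taken from the slides), and the counts are invented.

```python
def rank_sums(sample_a, sample_b):
    """Return the sum of pooled ranks for each sample, using midranks for ties."""
    pooled = sorted(list(sample_a) + list(sample_b))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each gets the average of those ranks.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return (sum(rank_of[v] for v in sample_a),
            sum(rank_of[v] for v in sample_b))

# Hypothetical counts, invented for illustration only.
sum_a, sum_b = rank_sums([4, 9, 6, 11], [12, 15, 9, 20])
print(sum_a, sum_b)
```

With equal sample sizes, rank sums that differ greatly suggest that one sample tends to hold the larger values; the test then asks how unusual such a split would be if H_0 were true.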
A brief digression to re-sampling theory
[Table: Number inside, Number outside, Mean]
Traditional evaluation would probably involve a t test; another approach is re-sampling.

Resampling
Treatment: Number
Inside: 3, 5, 2, 8, 7
Outside: 10, 7, 9, 12, 8
1) Assume both treatments come from the same distribution
2) Resample groups of 5 observations, with replacement, but irrespective of treatment
3) Calculate the mean for each group (e.g. 7.6)
4) Repeat many times
5) Calculate differences between pairs of means (remember the null hypothesis is that there is no effect of treatment). This generates a distribution of differences.
[Table: Mean 1, Mean 2, Difference for each pair of resampled groups]
[Figure: distribution of differences; x-axis Difference in Means, y-axes Number of Observations and Proportion per Bar]
OK, now what?
Compare distribution of differences to real difference
[Table: Number inside, Number outside, Mean]
Real difference = 4.2
Estimate the likelihood that the real difference comes from two similar distributions:
- Proportion of differences less than current
[Table: Mean 1, Mean 2, Difference; and on through 1000 differences]
- Likelihood is [value not shown] that the distributions are the same
What are the constraints of this sort of approach?
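The resampling procedure can be sketched with the inside/outside counts from the slides. The seed, the function name and the choice to count resampled differences at least as large in absolute value as the real one are assumptions made here; the slides do not say exactly how the proportion was tallied.

```python
import random
import statistics

inside = [3, 5, 2, 8, 7]
outside = [10, 7, 9, 12, 8]
observed = statistics.mean(outside) - statistics.mean(inside)   # real difference

def resampled_differences(pooled, group_size, n_resamples, seed=0):
    """Differences between means of two groups drawn with replacement."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Draw both groups from the pooled data, ignoring treatment labels.
        group_1 = rng.choices(pooled, k=group_size)
        group_2 = rng.choices(pooled, k=group_size)
        diffs.append(statistics.mean(group_1) - statistics.mean(group_2))
    return diffs

diffs = resampled_differences(inside + outside, 5, 1000)
# Small tolerance guards against floating point noise in the comparison.
extreme = sum(abs(d) >= observed - 1e-9 for d in diffs)
p_resample = extreme / len(diffs)
print(f"real difference = {observed:.1f}, proportion as extreme = {p_resample:.3f}")
```

A small proportion means that differences as large as the real one rarely arise when both groups are drawn from the same pool, which counts against the assumption in step 1.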
T-test vs resampling
[Table: P-value for each Test (Resampling, T-test)]
Why the difference?

Additional examples
Worked example
- Fecundity of predatory gastropods: samples of 37 and 42 egg capsules of Lepsiella from the littorinid zone and mussel zone respectively
- Counted number of eggs per capsule
- Null hypothesis: no difference between zones in mean number of eggs per capsule
- Ward & Quinn (1988), qk2002 Box 3.1

Specify H_0 and choose test statistic:
- H_0: μ_M = μ_L, i.e. the population mean numbers of eggs per capsule from both zones are equal
- The t statistic is the appropriate test statistic for comparing 2 population means
Specify a priori significance (probability) level (α):
- By convention, use α = 0.05 (5%).

Collect data, check assumptions, calculate test statistic from sample data:
[Table: Mean, SD, n for Littorinid and Mussel]
t = -5.39, df = 77
Compare the value of the t statistic to its sampling distribution, the probability distribution of the statistic (for specific df) when H_0 is true:
- What is the probability of obtaining a t value of 5.39 or greater from a t distribution with 77 df?
- What is the probability of taking samples with the observed or greater mean difference from 2 populations with the same means?
Probability (from JMP): P =
Look up in t table: P < 0.05
- If the probability of obtaining this value or larger is less than α, conclude that H_0 is unlikely to be true and reject it: statistically significant result.
- Our probability (<0.001) is less than 0.05, so reject H_0: statistically significant result.
- If the probability of obtaining this value or larger is greater than α, conclude that the data are consistent with H_0 and do not reject it: statistically non-significant result.
Presenting results of t test
Methods: An independent t test was used to compare the mean number of eggs per capsule from the two zones. Assumptions were checked with ...
Results: The mean number of eggs per capsule from the mussel zone was significantly greater than that from the littorinid zone (t = 5.39, df = 77, P < 0.001; see Fig. 2).
Analysis of variance (ANOVA) Comparing the means of more than two groups
Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationAnalytical Graphing. lets start with the best graph ever made
Analytical Graphing lets start with the best graph ever made Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian
More informationIntroduction to Linear regression analysis. Part 2. Model comparisons
Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual
More informationAnalytical Graphing. lets start with the best graph ever made
Analytical Graphing lets start with the best graph ever made Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian
More informationGROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION
FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89
More informationRigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis
Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis The Philosophy of science: the scientific Method - from a Popperian perspective Philosophy
More informationT.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS
ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only
More informationWilcoxon Test and Calculating Sample Sizes
Wilcoxon Test and Calculating Sample Sizes Dan Spencer UC Santa Cruz Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33 Differences in the Means of Two Independent Groups When
More informationRigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis
/3/26 Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis The Philosophy of science: the scientific Method - from a Popperian perspective Philosophy
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future
More informationRigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis
/9/27 Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis The Philosophy of science: the scientific Method - from a Popperian perspective Philosophy
More informationThis is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same!
Two sample tests (part II): What to do if your data are not distributed normally: Option 1: if your sample size is large enough, don't worry - go ahead and use a t-test (the CLT will take care of non-normal
More informationWorksheet 2 - Basic statistics
Worksheet 2 - Basic statistics Basic statistics references Fowler et al. (1998) -Chpts 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, & 16 (16.1, 16.2, 16.3, 16.9,16.11-16.14) Holmes et al. (2006) - Chpt 4 & Sections
More informationChapter 7 Comparison of two independent samples
Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N
More informationExam details. Final Review Session. Things to Review
Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit
More informationOne-Sample and Two-Sample Means Tests
One-Sample and Two-Sample Means Tests 1 Sample t Test The 1 sample t test allows us to determine whether the mean of a sample data set is different than a known value. Used when the population variance
More informationTentative solutions TMA4255 Applied Statistics 16 May, 2015
Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent
More informationRelating Graph to Matlab
There are two related course documents on the web Probability and Statistics Review -should be read by people without statistics background and it is helpful as a review for those with prior statistics
More information4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures
Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures
More informationFrequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=
A frequency distribution is a kind of probability distribution. It gives the frequency or relative frequency at which given values have been observed among the data collected. For example, for age, Frequency
More informationStatistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data
Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data 999 Prentice-Hall, Inc. Chap. 9 - Chapter Topics Comparing Two Independent Samples: Z Test for the Difference
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review In the previous lecture we considered the following tests: The independent
More informationInferences About the Difference Between Two Means
7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent
More informationThe t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies
The t-test: So Far: Sampling distribution benefit is that even if the original population is not normal, a sampling distribution based on this population will be normal (for sample size > 30). Benefit
More informationHYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă
HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and
More informationAnalysis of 2x2 Cross-Over Designs using T-Tests
Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous
More informationDistribution-Free Procedures (Devore Chapter Fifteen)
Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationBasic Statistics. 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation).
Basic Statistics There are three types of error: 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation). 2. Systematic error - always too high or too low
More informationBasics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.
Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample
More informationNon-parametric methods
Eastern Mediterranean University Faculty of Medicine Biostatistics course Non-parametric methods March 4&7, 2016 Instructor: Dr. Nimet İlke Akçay (ilke.cetin@emu.edu.tr) Learning Objectives 1. Distinguish
More informationStatistics Handbook. All statistical tables were computed by the author.
Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance
More informationTwo-Sample Inferential Statistics
The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is
More informationCHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
More informationParametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami
Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Comparing populations Suppose I want to compare the heights of males and females
More informationComparison of Two Population Means
Comparison of Two Population Means Esra Akdeniz March 15, 2015 Independent versus Dependent (paired) Samples We have independent samples if we perform an experiment in two unrelated populations. We have
More informationChapter 23: Inferences About Means
Chapter 3: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population
More information2011 Pearson Education, Inc
Statistics for Business and Economics Chapter 7 Inferences Based on Two Samples: Confidence Intervals & Tests of Hypotheses Content 1. Identifying the Target Parameter 2. Comparing Two Population Means:
More informationPermutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods
Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of
More informationResampling Methods. Lukas Meier
Resampling Methods Lukas Meier 20.01.2014 Introduction: Example Hail prevention (early 80s) Is a vaccination of clouds really reducing total energy? Data: Hail energy for n clouds (via radar image) Y i
Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.
Chapter 24 Comparing Means Copyright 2010 Pearson Education, Inc. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side. For example:
Introduction to Statistics with GraphPad Prism 7
Introduction to Statistics with GraphPad Prism 7 Outline of the course Power analysis with G*Power Basic structure of a GraphPad Prism project Analysis of qualitative data Chi-square test Analysis of quantitative
Stat 427/527: Advanced Data Analysis I
Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample
Chapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing
Chapter Fifteen Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Copyright 2010 Pearson Education, Inc. publishing as Prentice Hall 15-1 Internet Usage Data Table 15.1 Respondent Sex Familiarity
PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests
PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Transformations (pg 382): The shape of the distribution
Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true
Hypothesis Testing Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Statistical Hypothesis: conjecture about a population parameter
Nonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I
Nonparametric tests Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I Nonparametric one and two-sample tests If data do not come from a normal
7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between
7.2 One-Sample Correlation ( = a) Introduction Correlation analysis measures the strength and direction of association between variables. In this chapter we will test whether the population correlation
Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong
Statistics Primer ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong 1 Quick Overview of Statistics 2 Descriptive vs. Inferential Statistics Descriptive Statistics: summarize and describe data
df=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics
Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely
Nonparametric Statistics
Nonparametric Statistics Nonparametric or Distribution-free statistics: used when data are ordinal (i.e., rankings) used when ratio/interval data are not normally distributed (data are converted to ranks)
Do not copy, post, or distribute. Independent-Samples t Test and Mann- Chapter 13
Chapter 13 Independent-Samples t Test and Mann-Whitney U Test 13.1 Introduction and Objectives This chapter continues the theme of hypothesis testing as an inferential statistical procedure. In
Introduction to Statistical Data Analysis III
Introduction to Statistical Data Analysis III JULY 2011 Afsaneh Yazdani Preface Major branches of Statistics: - Descriptive Statistics - Inferential Statistics Preface What is Inferential Statistics? The
Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm
Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam June 8 th, 2016: 9am to 1pm Instructions: 1. This exam is to be completed independently. Do not discuss your work with
Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t
Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t t Confidence Interval for Population Mean Comparing z and t Confidence Intervals When neither z nor t Applies
Political Science 236 Hypothesis Testing: Review and Bootstrapping
Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The
CIVL 7904/8904 TRAFFIC FLOW THEORY LECTURE - 8
CIVL - 7904/8904 TRAFFIC FLOW THEORY LECTURE - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range / (1 + 3.322 log N) I-> Range of the class interval
Glossary for the Triola Statistics Series
Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling
Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test
Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test When samples do not meet the assumption of normality, parametric tests should not be used. To overcome this problem, non-parametric tests can
Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations
Basics of Experimental Design Review of Statistics And Experimental Design Scientists study relation between variables In the context of experiments these variables are called independent and dependent
SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics
SEVERAL μs AND MEDIANS: MORE ISSUES Business Statistics CONTENTS Post-hoc analysis ANOVA for 2 groups The equal variances assumption The Kruskal-Wallis test Old exam question Further study POST-HOC ANALYSIS
Transition Passage to Descriptive Statistics 28
viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of
Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.
TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted
9 One-Way Analysis of Variance
9 One-Way Analysis of Variance SW Chapter 11 - all sections except 6. The one-way analysis of variance (ANOVA) is a generalization of the two sample t test to k ≥ 2 groups. Assume that the populations of
Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals
Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling
Chapter 7 Class Notes Comparison of Two Independent Samples
Chapter 7 Class Notes Comparison of Two Independent Samples In this chapter, we'll compare means from two independently sampled groups using HTs (hypothesis tests). As noted in Chapter 6, there are two
Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
Rama Nada. -Ensherah Mokheemer.
- 9 - Rama Nada -Ensherah Mokheemer - Quick revision: Remember from the last lecture that chi square is an example of nonparametric test, other examples include Kruskal Wallis, Mann Whitney and
CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC
CHI SQUARE ANALYSIS INTRODUCTION TO NON-PARAMETRIC ANALYSES HYPOTHESIS TESTS SO FAR We've discussed One-sample t-test Dependent Sample t-tests Independent Samples t-tests
Statistics: revision
NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers
An inferential procedure to use sample data to understand a population Procedures
Hypothesis Test An inferential procedure to use sample data to understand a population Procedures Hypotheses, the alpha value, the critical region (z-scores), statistics, conclusion Two types of errors
36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression
36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form
Review of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
Bio 183 Statistics in Research. B. Cleaning up your data: getting rid of problems
Bio 183 Statistics in Research A. Research designs B. Cleaning up your data: getting rid of problems C. Basic descriptive statistics D. What test should you use? What is science?: Science is a way of knowing.(anon.?)
Business Statistics. Lecture 5: Confidence Intervals
Business Statistics Lecture 5: Confidence Intervals Goals for this Lecture Confidence intervals The t distribution 2 Welcome to Interval Estimation! Moments Mean 815.0340 Std Dev 0.8923 Std Error Mean
The t-statistic. Student's t Test
The t-statistic Student's t Test When the population standard deviation is not known, you cannot use a z score hypothesis test Use Student's t test instead Student's t, or t test is, conceptually, very
Business Statistics MEDIAN: NON-PARAMETRIC TESTS
Business Statistics MEDIAN: NON-PARAMETRIC TESTS CONTENTS Hypotheses on the median The sign test The Wilcoxon signed ranks test Old exam question HYPOTHESES ON THE MEDIAN The median is a central value
Chapter 24. Comparing Means
Chapter 24 Comparing Means Homework p579, 5, 7, 8, 10, 11, 17, 31, 3 Objective Students test null and alternate hypothesis about two Plot the Data The intuitive display for comparing
One-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
2. RELATIONSHIP BETWEEN A QUALITATIVE AND A QUANTITATIVE VARIABLE
7/09/06. RELATIONSHIP BETWEEN A QUALITATIVE AND A QUANTITATIVE VARIABLE Design and Data Analysis in Psychology II Susana Sanduvete Chaves Salvador Chacón Moscoso. INTRODUCTION You may examine gender differences
WELCOME! Lecture 13 Thommy Perlinger
Quantitative Methods II WELCOME! Lecture 13 Thommy Perlinger Parametrical tests (tests for the mean) Nature and number of variables One-way vs. two-way ANOVA One-way ANOVA Y X 1 1 One dependent variable
Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.
The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several
Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc.
Chapter 23 Inferences About Means Sampling Distributions of Means Now that we know how to create confidence intervals and test hypotheses about proportions, we do the same for means. Just as we did before,
Agonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data?
Agonistic Display in Betta splendens: Data Analysis By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether
Final Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your
Last week: Sample, population and sampling distributions finished with estimation & confidence intervals
Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling
Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =
2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random samples of a given size from a population. t ratio: t = (X̄ - µ_hyp) / s_X̄ The result
Psychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown
Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding
16.400/453J Human Factors Engineering. Design of Experiments II
16.400/453J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential
+ Specify 1 tail / 2 tail
Week 2: Null hypothesis Aeroplane seat designer wonders how wide to make the plane seats. He assumes population average hip size μ = 43.2cm Sample size n = 50 Question : Is the assumption μ = 43.2cm reasonable?
Introduction to Analysis of Variance (ANOVA) Part 2
Introduction to Analysis of Variance (ANOVA) Part 2 Single factor Serpulid recruitment and biofilms Effect of biofilm type on number of recruiting serpulid worms in Port Phillip Bay Response variable:
HYPOTHESIS TESTING. Hypothesis Testing
MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.
Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone's finger length!] n =
Hypothesis testing I I. What is hypothesis testing? [Note we're temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,
Sampling Distributions: Central Limit Theorem
Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)
The Difference in Proportions Test
Overview The Difference in Proportions Test Dr Tom Ilvento Department of Food and Resource Economics A Difference of Proportions test is based on large sample only Same strategy as for the mean We calculate