Outline The Rank-Sum Test Procedure Paired Data Comparing Two Variances Lab 8: Hypothesis Testing with R. Week 13 Comparing Two Populations, Part II


1 Week 13 Comparing Two Populations, Part II

2 Week 13 Objectives
Coverage of the topic of comparing two populations continues with new procedures and a new sampling design. The week concludes with a lab session. In particular:
1 The rank-sum test is presented.
2 The concept of paired data is introduced, and the paired-data T-test, the signed-rank test, and McNemar's test are described.
3 Levene's test and the F-test for comparing two variances are presented.
4 The lab session demonstrates the R implementation of test procedures for one-sample, two-sample, and regression problems, including checking the validity of assumptions.

3

4 Motivation
If the sample sizes are small and the populations are non-normal, the T test is not valid. The Mann-Whitney-Wilcoxon rank-sum test (or rank-sum test for short), described next, can be used with both small and large sample sizes. If the two populations are continuous, the null distribution of the test statistic (TS) is known even with very small sample sizes. For discrete populations, the null distribution of the TS can be well approximated with much smaller sample sizes than those required by contrast-based procedures. The rank-sum test has high power, especially if the two population distributions are heavy tailed or skewed.

5 The Null Hypothesis and the TS
The rank-sum procedure tests $H_0^F: F_1 = F_2$. Let $R_{ij}$ denote the (mid-)rank of observation $X_{ij}$ in the combined set of $N = n_1 + n_2$ observations, and set
\[ W_1 = \sum_{j=1}^{n_1} R_{1j}, \quad \bar R_1 = \frac{W_1}{n_1}, \quad W_2 = \sum_{j=1}^{n_2} R_{2j}, \quad \bar R_2 = \frac{W_2}{n_2}. \]
Then the Mann-Whitney-Wilcoxon TS is
\[ \bar R_1 - \bar R_2 = \frac{N}{n_1 n_2} \left( W_1 - n_1 \frac{N+1}{2} \right), \]
or simply $W_1$.

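6 The Standardized Rank-Sum TS and RR
If there are no ties,
\[ Z_{H_0} = \frac{W_1 - n_1 (N+1)/2}{\sqrt{n_1 n_2 (N+1)/12}}. \]
If $H_0^F$ holds, $Z_{H_0} \overset{\cdot}{\sim} N(0, 1)$ for $n_1, n_2 > 8$. The rejection regions (RR) are:
$H_a$                      RR at level $\alpha$
$\mu_1 - \mu_2 > 0$        $Z_{H_0} \ge z_\alpha$
$\mu_1 - \mu_2 < 0$        $Z_{H_0} \le -z_\alpha$
$\mu_1 - \mu_2 \ne 0$      $|Z_{H_0}| \ge z_{\alpha/2}$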

7 Example
Data on sputum histamine levels from 9 allergic and 13 non-allergic individuals are given in edu/acq/401/data/histamindata.txt. Is there a difference between the two populations? Test at $\alpha = .01$.
Solution. Here $R_{11} = 18$, $R_{12} = 11$, $R_{13} = 22$, $R_{14} = 19$, $R_{15} = 17$, $R_{16} = 21$, $R_{17} = 7$, $R_{18} = 20$, $R_{19} = 16$. Thus $W_1 = \sum_j R_{1j} = 151$ and
\[ Z_{H_0} = \frac{151 - 9(23)/2}{\sqrt{9(13)(23)/12}} = 3.17. \]
Since $n_1, n_2 > 8$, p-value $= 2[1 - \Phi(3.17)] \approx 0.0015$. Thus the difference is significant.
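The arithmetic of this example can be checked with a short R sketch. It uses only the nine ranks listed in the solution and the two sample sizes; the variable names are chosen here for illustration and are not from the slides.

```r
# Ranks of the 9 allergic observations in the combined sample (from the example)
r1 <- c(18, 11, 22, 19, 17, 21, 7, 20, 16)
n1 <- length(r1)   # 9 allergic individuals
n2 <- 13           # 13 non-allergic individuals
N  <- n1 + n2

W1 <- sum(r1)                                            # rank sum of sample 1
Z  <- (W1 - n1 * (N + 1) / 2) / sqrt(n1 * n2 * (N + 1) / 12)
p_value <- 2 * (1 - pnorm(abs(Z)))                       # two-sided p-value

W1
round(Z, 2)
p_value
```

The normal approximation is used here because both sample sizes exceed 8, as the previous slide requires.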

8 Effect of Outliers
In the above example, the t test does not reject at level [value missing]. With the data in data frame hi, t.test(hi$level ~ hi$sample) gives a p-value of 0.13, and t.test(hi$level ~ hi$sample, var.equal=TRUE) gives a p-value of [value missing]. In general, using a procedure when its underlying assumptions are violated will give misleading results.

9 Introduction, Motivation
Paired data arise when each experimental unit receives each of the two treatments that are being compared.
1 Compare the durability of two types of tires.
2 Compare two labs for the analysis of mercury content.
3 Two acne treatments, two cataract treatments, etc.
Paired data are of the form $(X_{11}, X_{21}), \ldots, (X_{1n}, X_{2n})$. CIs and the TS are again based on $\bar X_1 - \bar X_2$. But now $\bar X_1$ and $\bar X_2$ are not independent. Thus, previous formulas do not apply. For example,
\[ \sigma^2_{\bar X_1 - \bar X_2} = \sigma^2_{\bar X_1} + \sigma^2_{\bar X_2} - 2\,\mathrm{Cov}(\bar X_1, \bar X_2). \]
Similarly, the rank-sum test is not valid now.

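10 The paired data T-test
While $\mathrm{Cov}(\bar X_1, \bar X_2)$ can be estimated, it is easier to use the differences
\[ D_1 = X_{11} - X_{21}, \ldots, D_n = X_{1n} - X_{2n}. \]
$D_1, \ldots, D_n$ are independent, and $\bar D = \bar X_1 - \bar X_2$. Thus, $\sigma^2_{\bar D} = \sigma^2_{\bar X_1 - \bar X_2}$ can be estimated by $\hat\sigma^2_{\bar D} = S_D^2/n$, where
\[ S_D^2 = \frac{1}{n-1} \left[ \sum_{i=1}^n D_i^2 - \frac{1}{n} \Big( \sum_{i=1}^n D_i \Big)^2 \right]. \]
CIs and testing are based on the fact:
\[ \frac{\bar D - \mu_D}{S_D/\sqrt{n}} \sim T_{n-1} \]
if normality holds, or approximately if $n \ge 30$.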
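As a minimal sketch of the paired T procedure, the R code below computes $S_D^2$ and the T statistic from the differences and compares them with R's built-in paired test. The eight data pairs are hypothetical values made up for illustration; they are not from the slides.

```r
# Hypothetical paired measurements (illustrative values, not from the slides)
x1 <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 12.8, 11.6)
x2 <- c(12.4, 11.9, 12.9, 13.1, 12.2, 12.5, 13.3, 11.8)

d <- x1 - x2                                  # differences D_i
n <- length(d)
s2_d <- (sum(d^2) - sum(d)^2 / n) / (n - 1)   # S_D^2 as in the formula above
t_stat <- mean(d) / sqrt(s2_d / n)            # T statistic for H0: mu_D = 0
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

# Built-in paired test; its statistic and p-value should match the hand computation
builtin <- t.test(x1, x2, paired = TRUE)
c(by_hand = t_stat, builtin = unname(builtin$statistic))
```

Because n is well below 30 here, the T distribution is justified only if the differences can be assumed normal.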

11 Example
A total of 12 water samples are analyzed for mercury content by labs A and B. The paired data yield $\bar D = \bar X_1 - \bar X_2$ = [value missing] and $S_D$ = [value missing]. Does lab B give, on average, higher concentration results than lab A? Test at $\alpha = 0.05$.
Solution. Here $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 < 0$. Because $n < 30$, we must assume normality. Doing so, we have $T_{H_0} = \bar D/(S_D/\sqrt{12})$ = [value missing]. Since $T_{H_0} < -t_{.05,11} = -1.796$, $H_0$ is rejected.

12 It is important to be able to recognize paired data. For example, a study was conducted to see whether two cars, A and B, having very different wheel bases and turning radii, took the same time to parallel park. Seven drivers were randomly selected, and the time required for each of them to parallel park each of the 2 cars was measured. The results are as follows: Driver Car A B

13 The Signed-Rank test
1 Rank the absolute differences $|D_1|, \ldots, |D_n|$ from smallest to largest. Let $R_i$ denote the rank of $|D_i|$.
2 Assign to $R_i$ the sign of $D_i$, thus forming signed ranks.
3 Let $S_+$ be the sum of the ranks $R_i$ with positive sign, i.e. the sum of the positive signed ranks.
If $H_0$ holds, $\mu_{S_+} = \frac{n(n+1)}{4}$ and $\sigma^2_{S_+} = \frac{n(n+1)(2n+1)}{24}$. If $H_0$ holds and $n > 10$, $S_+ \overset{\cdot}{\sim} N(\mu_{S_+}, \sigma^2_{S_+})$. The TS for testing $H_0: \mu_D = 0$ is
\[ Z_{H_0} = \left( S_+ - \frac{n(n+1)}{4} \right) \Big/ \sqrt{\frac{n(n+1)(2n+1)}{24}}. \]
The RRs are the usual RRs of a Z-test.
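The three steps above can be sketched in R as follows. The twelve differences are hypothetical (chosen with no ties and no zeros so that the ranks are unambiguous), and the variable names are illustrative. The statistic V reported by wilcox.test for a single sample is the same sum of positive signed ranks.

```r
# Hypothetical differences D_i (no ties, no zeros); not the mercury data
d <- c(-1.2, 0.5, -2.3, 3.1, -0.7, 1.9, -4.4, 2.8, -0.1, -3.6, 0.9, -5.0)
n <- length(d)

r <- rank(abs(d))                  # step 1: ranks of |D_i|
s_plus <- sum(r[d > 0])            # steps 2-3: sum of the positive signed ranks

mu_s    <- n * (n + 1) / 4                        # mean of S+ under H0
sigma_s <- sqrt(n * (n + 1) * (2 * n + 1) / 24)   # sd of S+ under H0
z <- (s_plus - mu_s) / sigma_s                    # standardized TS
p_less <- pnorm(z)                                # p-value for H_a: mu_D < 0

# R's signed-rank statistic V equals S+
v <- wilcox.test(d, alternative = "less")$statistic
c(s_plus = s_plus, V = unname(v), z = z)
```

Note that wilcox.test uses the exact null distribution for small samples without ties, so its p-value can differ slightly from the normal-approximation value p_less.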

14 Example (Mercury concentrations from Labs A and B)
The 12 differences $D_i$ and the ranks of their absolute values are given in the table below. Test $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 < 0$ at $\alpha$ = [value missing].
$D_i$, $R_i$: [table values missing]
Solution: Here $S_+ = 13$. Thus, with $n = 12$,
\[ Z_{H_0} = \frac{13 - 39}{\sqrt{162.5}} = -2.04, \]
with p-value $= \Phi(-2.04) \approx 0.021$. With the 12 differences stored in the object d, the command wilcox.test(d, alternative="less") returns a p-value of [value missing].

15 Two Proportions with paired data
Here each pair $(X_{1j}, X_{2j})$ can be either (1, 1) or (1, 0) or (0, 1) or (0, 0). As an example, if n voters are asked, both before and after a presidential speech, whether or not they support a certain policy, $X_{1j} = 1$ or 0 according to whether the jth voter supports the policy or not before the speech, and $X_{2j} = 1$ or 0 according to whether the same voter supports it or not after the speech. Typically, however, the pairs $(X_{1j}, X_{2j})$ are not given. Instead the data are presented in the following table format.

16
                After
                1       0
Before    1     Y_1     Y_2
          0     Y_3     Y_4
$Y_1$ is the number of (1, 1) pairs, $Y_2$ is the number of (1, 0) pairs, $Y_3$ is the number of (0, 1) pairs, $Y_4$ is the number of (0, 0) pairs, and $Y_1 + Y_2 + Y_3 + Y_4 = n$.

17 A variation of the T statistic, used only for testing $H_0: p_1 - p_2 = 0$, is the McNemar test statistic:
\[ MN = \frac{Y_2 - Y_3}{\sqrt{Y_2 + Y_3}}. \]
This is referred to N(0,1), so the RR for $H_a: p_1 > p_2$ is $MN > z_\alpha$. Similarly for the other $H_a$. R uses the square of MN and refers it to a $\chi^2_1$ distribution. In this form only $H_a: p_1 \ne p_2$ can be tested, with p-value 1-pchisq(MN^2, 1).

18 Example (McNemar's test)
Data on approval of the President's performance in office in two surveys, one month apart, of 1600 voting-age Americans give $Y_1 = 794$, $Y_2 = 150$, $Y_3 = 86$, $Y_4 = 570$. Is there evidence, at $\alpha = 0.05$, of a shift in public opinion? Report the p-value.
Solution. Here $MN = (150 - 86)/\sqrt{150 + 86} = 4.166$. Since $|MN| > z_{0.025} = 1.96$, we conclude that there is evidence of a shift in public opinion. The R command 2*(1-pnorm(4.166)) returns a p-value of 3.10e-05.
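A short R sketch of this example follows. It computes MN from the table counts and compares its square with the statistic of R's mcnemar.test. The option correct = FALSE is set here because mcnemar.test applies a continuity correction by default, which the MN formula on the slides does not use; the object names are illustrative.

```r
# Before/after table from the example: rows = Before (1, 0), columns = After (1, 0)
tab <- matrix(c(794, 150,
                 86, 570), nrow = 2, byrow = TRUE,
              dimnames = list(Before = c("1", "0"), After = c("1", "0")))

y2 <- tab[1, 2]   # (1, 0) pairs
y3 <- tab[2, 1]   # (0, 1) pairs

mn <- (y2 - y3) / sqrt(y2 + y3)          # McNemar statistic, referred to N(0, 1)
p_norm <- 2 * (1 - pnorm(abs(mn)))       # two-sided p-value

# Built-in version without continuity correction: statistic equals MN^2
res <- mcnemar.test(tab, correct = FALSE)
c(MN = mn, MN_squared = mn^2, chisq = unname(res$statistic))
```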

19 Levene's Test
It is based on the idea that if the variances are equal, then
\[ V_{1j} = |X_{1j} - \tilde X_1|, \ j = 1, \ldots, n_1, \quad \text{and} \quad V_{2j} = |X_{2j} - \tilde X_2|, \ j = 1, \ldots, n_2, \]
where $\tilde X_i$, $i = 1, 2$, is the median of the ith sample, correspond to populations with equal means and variances. Thus, equality of variances can be tested by testing the hypothesis $H_0: \mu_{V_1} = \mu_{V_2}$ vs $H_a: \mu_{V_1} \ne \mu_{V_2}$ using the two-sample t-test with pooled variance.

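20 Example
The plasma vitamin C concentrations (µmol/l) of five randomly selected smokers and nonsmokers are: Nonsmokers [values missing], $s_1$ = [value missing]; Smokers [values missing], $s_2$ = [value missing]. Test $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_a: \sigma_1^2 \ne \sigma_2^2$ at $\alpha$ = [value missing].
Solution. Here $\tilde X_1 = 41.68$, $\tilde X_2$ = [value missing]. Thus, the $V_1$ values for Nonsmokers and the $V_2$ values for Smokers are [values missing]. The R commands x=c(0.20,0.03,0.30,0.00,0.50); y=c(0.26,0.00,0.17,0.05,0.23); t.test(x, y, var.equal=TRUE) give a p-value of [value missing]. Thus, $H_0$ is not rejected.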
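The two-sample version of Levene's test described above can be sketched as a small R function. The function name levene_two_sample is hypothetical (it is not a base R function), and the raw-data call uses made-up numbers; the final call reuses the V values given in the R commands of this example.

```r
# Hypothetical helper: Levene's test for two samples, following the slides
# (absolute deviations from each sample median, then a pooled-variance t-test)
levene_two_sample <- function(x1, x2) {
  v1 <- abs(x1 - median(x1))
  v2 <- abs(x2 - median(x2))
  t.test(v1, v2, var.equal = TRUE)
}

# Illustration on made-up raw data
demo <- levene_two_sample(c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10))

# The example's own V values, as entered in its R commands
x <- c(0.20, 0.03, 0.30, 0.00, 0.50)
y <- c(0.26, 0.00, 0.17, 0.05, 0.23)
slide_res <- t.test(x, y, var.equal = TRUE)
slide_res$p.value
```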

21 The F Test Under Normality
When the two samples have been drawn from normal populations, the exact distribution of $S_1^2/S_2^2$ is a multiple of an F distribution.
Theorem. Let $X_{11}, \ldots, X_{1n_1}$ be a random sample from a normal distribution with variance $\sigma_1^2$, let $X_{21}, \ldots, X_{2n_2}$ be another sample from a normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the two sample variances. Then the rv
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \]
has an F distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom.

22 The test statistic for $H_0: \sigma_1^2 = \sigma_2^2$ is
\[ F_{H_0} = \frac{S_1^2}{S_2^2}. \]
If the ratio differs sufficiently from 1, the null hypothesis is rejected. In particular, the RRs for testing $H_0: \sigma_1^2 = \sigma_2^2$ are:
$H_a$                            RR at level $\alpha$
$\sigma_1^2 > \sigma_2^2$        $F_{H_0} > F_{n_1-1, n_2-1; \alpha}$
$\sigma_1^2 < \sigma_2^2$        $F_{H_0}^{-1} > F_{n_2-1, n_1-1; \alpha}$
$\sigma_1^2 \ne \sigma_2^2$      either $F_{H_0} > F_{n_1-1, n_2-1; \alpha/2}$ or $F_{H_0}^{-1} > F_{n_2-1, n_1-1; \alpha/2}$
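As a rough illustration of the F test, the R sketch below computes $F_{H_0}$ and a two-sided p-value by hand and compares them with the built-in var.test. The two samples are simulated here under assumed normal populations (the seed, sizes, and parameters are arbitrary choices for illustration).

```r
set.seed(13)
x1 <- rnorm(12, mean = 50, sd = 4)   # simulated sample 1 (hypothetical)
x2 <- rnorm(10, mean = 50, sd = 3)   # simulated sample 2 (hypothetical)

f_stat <- var(x1) / var(x2)          # F_H0 = S1^2 / S2^2
df1 <- length(x1) - 1
df2 <- length(x2) - 1

p_upper <- 1 - pf(f_stat, df1, df2)          # p-value for H_a: sigma1^2 > sigma2^2
p_two   <- 2 * min(p_upper, 1 - p_upper)     # p-value for the two-sided alternative

res <- var.test(x1, x2)                      # built-in F test
c(by_hand = p_two, builtin = res$p.value)
```

Rejecting when p_two is below $\alpha$ is equivalent to the two-sided rejection region in the table above.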

23 Example
Consider the data in the previous example, and assume the underlying populations are normal. The test statistic is $F_{H_0}$ = [value missing]. By the formula for the p-value on p. 333, the p-value, found with 2*(1-pf(2.4, 3, 3)), is 0.49.

24

25 The R commands
If y and x contain the values of the response and the predictor, the basic commands for testing in regression are: out=lm(y ~ x); summary(out); summary(aov(out)). summary(out) gives the estimated regression coefficients and their standard errors, the p-values for testing that each coefficient is zero, $R^2$, and also the F-test statistic and p-value for the model utility test. summary(aov(out)) gives the ANOVA table.

26 Illustration with Simulated Data
e=rnorm(50,0,5); x=runif(50,0,10); y=25-3.4*x+e; out=lm(y ~ x)
For the data generated, the summary(out) output includes
Coefficients: Estimate Std. Error t value Pr(>|t|)
(Intercept) <2e-16
x <2e-16
Residual standard error: on 48 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 48 DF, p-value: < 2.2e-16

27 Moreover, the summary(aov(out)) output includes
Df Sum Sq Mean Sq F value Pr(>F)
x <2e-16
Residuals
The standard errors of the coefficients in the summary(out) output can be used for computing T statistics for other hypotheses regarding them. For example, the T statistic for testing $H_0: \beta_1 = -3.4$ vs $H_a: \beta_1 \ne -3.4$ is $T_{H_0} = (\hat\beta_1 + 3.4)/S_{\hat\beta_1}$ = [value missing], with corresponding p-value $2(1 - G_{48}(1.253))$ = [value missing]. The commands qqnorm(resid(out)); qqline(resid(out), col=2) can be used to check the normality assumption.
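The test of $H_0: \beta_1 = -3.4$ can be sketched in R by pulling the estimate and its standard error out of the coefficient table. The seed below is an arbitrary choice added for reproducibility, so the numbers will not match those on the slides, which were generated without a recorded seed.

```r
set.seed(401)                       # arbitrary seed, not from the slides
e <- rnorm(50, 0, 5)
x <- runif(50, 0, 10)
y <- 25 - 3.4 * x + e
out <- lm(y ~ x)

coefs <- summary(out)$coefficients  # matrix with Estimate, Std. Error, t value, Pr(>|t|)
b1    <- coefs["x", "Estimate"]
se_b1 <- coefs["x", "Std. Error"]

t_stat  <- (b1 - (-3.4)) / se_b1                             # T statistic for H0: beta1 = -3.4
p_value <- 2 * (1 - pt(abs(t_stat), df = out$df.residual))   # two-sided p-value, 48 df
c(t = t_stat, p = p_value)
```

Since the data were simulated with true slope -3.4, a large p-value is the typical outcome, though any single simulated data set can reject by chance.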

28

29 T-tests and T-intervals for one mean
Let x contain the data set. By default, the command t.test(x), which is equivalent to t.test(x, mu=0, alternative="two.sided", conf.level=0.95), gives the t-statistic, the df, the p-value for testing $H_0: \mu = 0$ against the two-sided alternative, the 95% CI for $\mu$, and $\bar X$. To test $H_0: \mu = 8.5$, replace mu=0 by mu=8.5. For one-sided alternatives, use alternative="less" or alternative="greater". Note, however, that the CIs are then one-sided.

30 Example
Is there evidence that the average level of radiation is higher than the federal health standard of 10 W/cm^2? Use the data in ExRadiationTestData.txt to test at $\alpha$ = [value missing]. Also, report the p-value, and construct a 95% CI.
Solution. Reading the data set into the R object x, the command t.test(x, mu=10, alternative="greater") returns a p-value of [value missing]. Thus, $H_0: \mu = 10$ cannot be rejected in favor of $H_a: \mu > 10$ at the stated level. Next use t.test(x, mu=10, alternative="two.sided") to get a 95% CI of (9.773, [value missing]).
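Since the data file is not reproduced here, the following R sketch uses simulated values as a stand-in (an assumption made purely for illustration) to show why the example runs t.test twice: the one-sided call gives the p-value, but its CI is one-sided, so the two-sided call is needed for the usual 95% interval.

```r
set.seed(30)
x <- rnorm(25, mean = 10.5, sd = 2)   # hypothetical stand-in for the radiation data

one_sided <- t.test(x, mu = 10, alternative = "greater")   # test of H_a: mu > 10
two_sided <- t.test(x, mu = 10)                            # default two-sided test

one_sided$p.value     # p-value for the one-sided alternative
one_sided$conf.int    # one-sided interval: the upper limit is Inf
two_sided$conf.int    # the usual two-sided 95% CI
```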

31 Power and sample size calculations for $H_0: \mu = \mu_0$
First one needs to install the package pwr using the command install.packages("pwr"). Then issue the command library(pwr) to load the package in the current R session. The command for computing the power at a given $\mu_a$ with given n, $\alpha$ and S values, for $H_a: \mu > \mu_0$, is
pwr.t.test(n, $(\mu_a - \mu_0)/S$, $\alpha$, power=NULL, "one.sample", "greater")
For $H_a: \mu < \mu_0$ and $H_a: \mu \ne \mu_0$ replace "greater" by "less" and "two.sided", respectively.

32 Example
For the testing problem $H_0: \mu = 10$ vs $H_a: \mu > 10$ with the ExRadiationTestData.txt data set, find the power at $\mu_a = 11$.
Solution. The commands length(x); sd(x) return n = 25 and S = 2.00 for this data set. The R command pwr.t.test(25, (11-10)/2.00, 0.05, power=NULL, "one.sample", "greater") returns a power of [value missing]. NOTE: Treating S as the true $\sigma$, the command 1-pnorm((10-11)/(2.00/sqrt(25)) + qnorm(0.95)) returns a power of 0.80 according to the formula in the teaching slides.

33 The command for computing the sample size needed to achieve a certain level of power at $\mu_a$ with given $\alpha$ and S values, for $H_a: \mu > \mu_0$, is
pwr.t.test(n=NULL, $(\mu_a - \mu_0)/S$, $\alpha$, power($\mu_a$), "one.sample", "greater")
For $H_a: \mu < \mu_0$ and $H_a: \mu \ne \mu_0$ replace "greater" by "less" and "two.sided", respectively.

34 Example
For the testing problem $H_0: \mu = 10$ vs $H_a: \mu > 10$ with the ExRadiationTestData.txt data set, find the sample size needed to achieve power of 0.9 at $\mu_a = 11$.
Solution. The R command pwr.t.test(n=NULL, (11-10)/2.00, 0.05, 0.9, "one.sample", "greater") returns a sample size of 35.65, which is rounded up to 36. NOTE: Treating S as the true $\sigma$, the command (2.00*(qnorm(.95)+qnorm(.9))/(10-11))^2 returns a sample size of 34.26, which is rounded up to 35, according to the formula in the teaching slides.
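The power and sample size calculations of the last two examples can also be sketched without the pwr package. The code below is an alternative, not the slides' method: it uses power.t.test from base R's stats package for the T-based values, alongside the normal-approximation formulas quoted in the NOTEs. The inputs n = 25, S = 2.00, $\mu_0 = 10$, $\mu_a = 11$ and $\alpha = 0.05$ are taken from the examples.

```r
n <- 25; s <- 2.00; mu0 <- 10; mua <- 11; alpha <- 0.05

# Normal approximation (treating S as the true sigma), as in the NOTEs
power_norm <- 1 - pnorm((mu0 - mua) / (s / sqrt(n)) + qnorm(1 - alpha))
n_norm     <- (s * (qnorm(1 - alpha) + qnorm(0.9)) / (mu0 - mua))^2

# T-based values from base R (stats::power.t.test), one-sample, one-sided
power_t <- power.t.test(n = n, delta = mua - mu0, sd = s, sig.level = alpha,
                        type = "one.sample", alternative = "one.sided")$power
n_t     <- power.t.test(delta = mua - mu0, sd = s, sig.level = alpha, power = 0.9,
                        type = "one.sample", alternative = "one.sided")$n

c(power_norm = power_norm, power_t = power_t)
c(n_norm = ceiling(n_norm), n_t = ceiling(n_t))   # sample sizes rounded up
```

The T-based power is somewhat lower, and the T-based sample size somewhat larger, than the normal-approximation values because the T critical value accounts for estimating $\sigma$.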

35 Two independent samples
The t.test command can also be used for comparing two means, with both independent and paired data. The two samples can be in two separate columns (i.e., x and y), or combined in one column, say y, with a separate column, say x, indicating the sample membership of each observation. The default is to treat the two samples as independent, construct a 95% CI, and give the p-value for $H_a: \mu_1 - \mu_2 \ne 0$, without assuming $\sigma_1 = \sigma_2$. The commands with these default options are:
t.test(x, y) # One sample in x, the other in y
t.test(y ~ x) # For values in y and sample index in x

36 For the pooled variance T test and a 99% CI do: t.test(y ~ x, var.equal = TRUE, conf.level = 0.99), and similarly if the two samples are in separate columns. To test a different null hypothesis, e.g., $H_0: \mu_1 - \mu_2 = 1.8$ vs $H_a: \mu_1 - \mu_2 < 1.8$, do: t.test(y ~ x, mu=1.8, alternative = "less"), and similarly if the two samples are in separate columns. Other options are alternative = "greater", or the default "two.sided".

37 Example
Use the R data set airquality to compare the ozone levels in May and August. Report the p-value, test at 0.05, and construct a 95% CI for $\mu_1 - \mu_2$, with and without the assumption of equal variances. [NOTE: Normality is violated; check with boxplot(Ozone ~ Month, data = airquality).]
Solution: Use: y1=airquality$Ozone; x1=airquality$Month; x=y1[which(x1==5)]; y=y1[which(x1==8)]; t.test(x, y); t.test(x, y, var.equal = TRUE)
More advanced application: t.test(Ozone ~ Month, data = airquality, subset = Month %in% c(5, 8))
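A runnable version of this solution is sketched below using the built-in airquality data set (note the capitalized column names Ozone and Month). The object names may, aug, welch, pooled and by_formula are illustrative choices. Missing ozone values are dropped by t.test automatically.

```r
# Ozone readings for May (Month 5) and August (Month 8); NAs are ignored by t.test
may <- airquality$Ozone[airquality$Month == 5]
aug <- airquality$Ozone[airquality$Month == 8]

welch  <- t.test(may, aug)                     # default: unequal variances
pooled <- t.test(may, aug, var.equal = TRUE)   # pooled-variance T test

# Same Welch test through the formula interface
by_formula <- t.test(Ozone ~ Month, data = airquality,
                     subset = Month %in% c(5, 8))

c(welch = welch$p.value, pooled = pooled$p.value)
welch$conf.int    # 95% CI for mu1 - mu2 without assuming equal variances
pooled$conf.int   # 95% CI under the equal-variance assumption
```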

38 The basic command for testing and CI construction with paired data is t.test(x, y, paired = TRUE), with one sample in x and the other in y. The formula form t.test(y ~ x, paired = TRUE) works only in older R versions; R 4.4.0 and later reject paired in the formula method, where t.test(Pair(x, y) ~ 1) is used instead. Other options can be added as before. For example, t.test(x, y, alternative = "less", mu = 1.8, paired = TRUE, conf.level = 0.9), where alternative is one of "two.sided", "less", "greater". With paired data, equality of the two marginal variances is a non-issue, so you never need to use var.equal=TRUE.

39 Example
Two brands of motorcycle tires are to be compared for durability. Eight motorcycles are selected at random and one tire from each brand is randomly assigned (front or back) on each motorcycle. The motorcycles are then run until the tires wear out. The data in motorcycletireslifetimes.txt are in km. Use the paired T-test procedure to test the hypothesis of equal average durability at level $\alpha = 0.05$, and to construct a 90% CI for $\mu_1 - \mu_2$.
Solution: Read the data in tl and use: x=tl$brand1; y=tl$brand2; t.test(x, y, paired=TRUE, conf.level=0.9) # set x and y and construct the test and CIs

40

41 The Rank Sum Test
The wilcox.test command can be used to conduct both the rank-sum test and the signed-rank test. Again, the two samples can be in two separate columns, or combined in one column with a separate column indicating the sample membership of each observation. The default is to treat the two samples as independent, and give the p-value for testing equality of the two populations against the two-sided alternative, without constructing a CI:
wilcox.test(x, y) # One sample in x, the other in y
wilcox.test(y ~ x) # For values in y and sample index in x

42 To get a CI for the location difference use: wilcox.test(y ~ x, conf.int = TRUE, conf.level = 0.9) [The description of this CI is not in the book.] To test different null and alternative hypotheses use: wilcox.test(y ~ x, mu=1.8, alternative = "less"), or alternative = "greater". Similarly if the two samples are in different columns.

43 Example
Use the R data set airquality to compare the ozone levels in May and August. [Check data set with boxplot(Ozone ~ Month, data = airquality)]
Solution: Use y1=airquality$Ozone; x1=airquality$Month; x=y1[which(x1==5)]; y=y1[which(x1==8)]; wilcox.test(x, y, conf.int = TRUE)
More advanced application: wilcox.test(Ozone ~ Month, data = airquality, subset = Month %in% c(5, 8))
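The sketch below runs this rank-sum comparison on the built-in airquality data and also relates R's reported statistic to the rank sum $W_1$ from the lecture part: wilcox.test reports $W = W_1 - n_1(n_1+1)/2$. Setting exact = FALSE is a choice made here because the ozone data contain ties, in which case R falls back to the normal approximation anyway; the object names are illustrative.

```r
oz  <- airquality$Ozone
may <- oz[airquality$Month == 5 & !is.na(oz)]   # May readings, NAs removed
aug <- oz[airquality$Month == 8 & !is.na(oz)]   # August readings, NAs removed
n1  <- length(may)

# Rank sum of the first sample in the combined sample (mid-ranks for ties)
w1 <- sum(rank(c(may, aug))[seq_len(n1)])

res <- wilcox.test(may, aug, exact = FALSE, conf.int = TRUE)
by_formula <- wilcox.test(Ozone ~ Month, data = airquality,
                          subset = Month %in% c(5, 8), exact = FALSE)

c(W_reported = unname(res$statistic), W1_shifted = w1 - n1 * (n1 + 1) / 2)
res$p.value
res$conf.int   # CI for the location difference
```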

44 Rank sum for paired data (Signed-Rank Test)
The basic command for the signed-rank test with paired data (without constructing a CI) is:
wilcox.test(x, y, paired = TRUE) # One sample in x, the other in y
The formula form wilcox.test(y ~ x, paired = TRUE) (values in y and sample index in x) works only in older R versions; R 4.4.0 and later reject paired in the formula method. Other options can be added as before. For example, wilcox.test(x, y, alternative = "less", mu = 1.8, paired = TRUE, conf.int = TRUE, conf.level = 0.9), or with alternative = "greater".

45 Example
Two brands of motorcycle tires are to be compared for durability. Eight motorcycles are selected at random and one tire from each brand is randomly assigned (front or back) on each motorcycle. The motorcycles are then run until the tires wear out. The data in 401/Data/motorcycleTiresLifetimes.txt are in km. Use the signed-rank test procedure to test the hypothesis of equal durability at level $\alpha = 0.05$, and to construct a 90% CI for the location difference.
Solution: Read the data in tl and use: x=tl$brand1; y=tl$brand2; wilcox.test(x, y, paired=TRUE, conf.int = TRUE, conf.level=0.9) # set x and y and construct the test and CIs
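Because the tire data file is not reproduced here, the R sketch below substitutes eight hypothetical pairs of lifetimes (made-up values, in km) to show the call pattern of the solution and how the reported statistic V relates to the sum of positive signed ranks.

```r
# Hypothetical tire lifetimes in km for 8 motorcycles (illustrative values only)
brand1 <- c(36925, 45300, 36240, 32100, 37210, 48360, 38200, 33500)
brand2 <- c(34318, 42280, 35500, 31950, 38015, 47800, 37810, 33215)

# Signed-rank test with a 90% CI for the location difference
res <- wilcox.test(brand1, brand2, paired = TRUE,
                   conf.int = TRUE, conf.level = 0.9)

# The reported statistic V is the sum of the positive signed ranks
d <- brand1 - brand2
s_plus <- sum(rank(abs(d))[d > 0])

c(V = unname(res$statistic), S_plus = s_plus)
res$p.value
res$conf.int
```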

46

47 Set the number of successes and the number of trials in x and n. For example, use x=c(16,14); n=c(200,400) if $X_1 = 16$, $X_2 = 14$, $n_1 = 200$, $n_2 = 400$. To test $H_0: p_1 - p_2 = 0$ vs the two-sided alternative, and construct a 95% CI for $p_1 - p_2$, use prop.test(x, n), or, equivalently, prop.test(x, n, alternative = "two.sided", conf.level = 0.95). Other alternative options are "less" and "greater". There is no option for testing other null hypotheses, e.g., $H_0: p_1 - p_2 = 0.1$.

48 Example
An article in Knee Surgery, Sports Traumatology, Arthroscopy (2005), Vol. 13, reported results of arthroscopic meniscal repair with an absorbable screw. For tears greater than 25 millimeters, 10 of 18 repairs were successful, while for tears less than 25 millimeters, 22 of 30 were successful. Is there evidence that the success rates for the two types of tears are different? Test at $\alpha = 0.1$, report the p-value, and construct a 90% confidence interval for $p_1 - p_2$.
Solution: Use x=c(10,22); n=c(18,30); prop.test(x, n, conf.level = 0.9)
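The solution can be run as the short R sketch below, which also extracts the pieces the example asks for. Note that prop.test applies a continuity correction by default, so its p-value differs somewhat from an uncorrected Z test of the two proportions.

```r
x <- c(10, 22)   # successes: tears > 25 mm, tears < 25 mm
n <- c(18, 30)   # numbers of repairs

res <- prop.test(x, n, conf.level = 0.9)

res$estimate   # sample proportions 10/18 and 22/30
res$p.value    # p-value for H0: p1 - p2 = 0 vs the two-sided alternative
res$conf.int   # 90% CI for p1 - p2
```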


More information

The independent-means t-test:

The independent-means t-test: The independent-means t-test: Answers the question: is there a "real" difference between the two conditions in my experiment? Or is the difference due to chance? Previous lecture: (a) Dependent-means t-test:

More information

Regression and the 2-Sample t

Regression and the 2-Sample t Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement

More information

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely

More information

Analysis of Variance Bios 662

Analysis of Variance Bios 662 Analysis of Variance Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-10-21 13:34 BIOS 662 1 ANOVA Outline Introduction Alternative models SS decomposition

More information

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data 1999 Prentice-Hall, Inc. Chap. 10-1 Chapter Topics The Completely Randomized Model: One-Factor

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Stat 5102 Final Exam May 14, 2015

Stat 5102 Final Exam May 14, 2015 Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 1 1-1 Basic Business Statistics 11 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Basic Business Statistics, 11e 009 Prentice-Hall, Inc. Chap 1-1 Learning Objectives In this chapter,

More information

Two sample Hypothesis tests in R.

Two sample Hypothesis tests in R. Example. (Dependent samples) Two sample Hypothesis tests in R. A Calculus professor gives their students a 10 question algebra pretest on the first day of class, and a similar test towards the end of the

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics Nonparametric or Distribution-free statistics: used when data are ordinal (i.e., rankings) used when ratio/interval data are not normally distributed (data are converted to ranks)

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Chapter 10: Inferences based on two samples

Chapter 10: Inferences based on two samples November 16 th, 2017 Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 1: Descriptive statistics Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter 8: Confidence

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Analysis of 2x2 Cross-Over Designs using T-Tests

Analysis of 2x2 Cross-Over Designs using T-Tests Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Nonparametric Location Tests: k-sample

Nonparametric Location Tests: k-sample Nonparametric Location Tests: k-sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)

More information

i=1 X i/n i=1 (X i X) 2 /(n 1). Find the constant c so that the statistic c(x X n+1 )/S has a t-distribution. If n = 8, determine k such that

i=1 X i/n i=1 (X i X) 2 /(n 1). Find the constant c so that the statistic c(x X n+1 )/S has a t-distribution. If n = 8, determine k such that Math 47 Homework Assignment 4 Problem 411 Let X 1, X,, X n, X n+1 be a random sample of size n + 1, n > 1, from a distribution that is N(µ, σ ) Let X = n i=1 X i/n and S = n i=1 (X i X) /(n 1) Find the

More information

Statistics Handbook. All statistical tables were computed by the author.

Statistics Handbook. All statistical tables were computed by the author. Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance

More information

3. Nonparametric methods

3. Nonparametric methods 3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

More information

Nonparametric Tests. Mathematics 47: Lecture 25. Dan Sloughter. Furman University. April 20, 2006

Nonparametric Tests. Mathematics 47: Lecture 25. Dan Sloughter. Furman University. April 20, 2006 Nonparametric Tests Mathematics 47: Lecture 25 Dan Sloughter Furman University April 20, 2006 Dan Sloughter (Furman University) Nonparametric Tests April 20, 2006 1 / 14 The sign test Suppose X 1, X 2,...,

More information

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t = 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

Inference with Heteroskedasticity

Inference with Heteroskedasticity Inference with Heteroskedasticity Note on required packages: The following code requires the packages sandwich and lmtest to estimate regression error variance that may change with the explanatory variables.

More information

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /8/2016 1/38

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /8/2016 1/38 BIO5312 Biostatistics Lecture 11: Multisample Hypothesis Testing II Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/8/2016 1/38 Outline In this lecture, we will continue to

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test

Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test When samples do not meet the assumption of normality parametric tests should not be used. To overcome this problem, non-parametric tests can

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Statistics for EES Factorial analysis of variance

Statistics for EES Factorial analysis of variance Statistics for EES Factorial analysis of variance Dirk Metzler June 12, 2015 Contents 1 ANOVA and F -Test 1 2 Pairwise comparisons and multiple testing 6 3 Non-parametric: The Kruskal-Wallis Test 9 1 ANOVA

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

Chapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing

Chapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Chapter Fifteen Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Copyright 2010 Pearson Education, Inc. publishing as Prentice Hall 15-1 Internet Usage Data Table 15.1 Respondent Sex Familiarity

More information

TMA4255 Applied Statistics V2016 (23)

TMA4255 Applied Statistics V2016 (23) TMA4255 Applied Statistics V2016 (23) Part 7: Nonparametric tests Signed-Rank test [16.2] Wilcoxon Rank-sum test [16.3] Anna Marie Holand April 19, 2016, wiki.math.ntnu.no/tma4255/2016v/start 2 Outline

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Multiple Regression Introduction to Statistics Using R (Psychology 9041B)

Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Paul Gribble Winter, 2016 1 Correlation, Regression & Multiple Regression 1.1 Bivariate correlation The Pearson product-moment

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

10.4 Hypothesis Testing: Two Independent Samples Proportion

10.4 Hypothesis Testing: Two Independent Samples Proportion 10.4 Hypothesis Testing: Two Independent Samples Proportion Example 3: Smoking cigarettes has been known to cause cancer and other ailments. One politician believes that a higher tax should be imposed

More information

Week 12 Hypothesis Testing, Part II Comparing Two Populations

Week 12 Hypothesis Testing, Part II Comparing Two Populations Week 12 Hypothesis Testing, Part II Week 12 Hypothesis Testing, Part II Week 12 Objectives 1 The principle of Analysis of Variance is introduced and used to derive the F-test for testing the model utility

More information

Introductory Statistics with R: Simple Inferences for continuous data

Introductory Statistics with R: Simple Inferences for continuous data Introductory Statistics with R: Simple Inferences for continuous data Statistical Packages STAT 1301 / 2300, Fall 2014 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail: sungkyu@pitt.edu

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Stat 311: HW 9, due Th 5/27/10 in your Quiz Section

Stat 311: HW 9, due Th 5/27/10 in your Quiz Section Stat 311: HW 9, due Th 5/27/10 in your Quiz Section Fritz Scholz Your returned assignment should show your name and student ID number. It should be printed or written clearly. 1. The data set ReactionTime

More information