8 Comparing Two Independent Populations
- Joy Mary Bishop
- 6 years ago
We'll study these methods for comparing two independent populations:

1. The Two-Sample T-Test (Normal with Equal Variances)
2. The Welch T-Test (Normal)
3. Bootstrap for Two Samples
4. The Wilcoxon Rank Sum or Mann-Whitney Test
5. Comparing Two Population Proportions

8.1 The Two-Sample T-Test (Normal with Equal but Unknown Variances)

e.g. The horned lizard has spikes on its head that may protect against its primary predator, the loggerhead shrike. Researchers wanted to compare dead lizards killed by shrikes with live lizards from the same area. An SRS was taken from each population. The longest spike was measured, in mm. Is there a difference in longest spike length across the two populations? Here are the data:

Dead: 7.65, 20.83, 24.59, 8.52, 2.40, 23.78, 20.36, 8.83, 2.83,
Live: 23.76, 2.7, 26.3, 20.8, 23.0, 24.84, 9.34, 24.94, 27.4, 25.87, 8.95, 22.6

Start with graphical and numerical summaries:

[Figure: density histograms of Longest Spike Length (mm), one panel for Dead and one for Live]

Group   n      $\bar{x}$   s
Dead    ____   ____        ____
Live    ____   ____        ____

The summaries show ____. Is it ____, or just the result of ____? The shift of sample means matters only ____ of the sample data.
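Compare two population means, $\mu_1$ and $\mu_2$, by studying their ____.

Notation:

                  Population 1    Population 2
Variable          $X_1$           $X_2$
Mean              $\mu_1$         $\mu_2$
Variance          $\sigma_1^2$    $\sigma_2^2$
Sample size       $n_1$           $n_2$
Sample mean       $\bar{X}_1$     $\bar{X}_2$
Sample variance   $s_1^2$         $s_2^2$

For inference about $\mu_1 - \mu_2$, use the statistic $\bar{X}_1 - \bar{X}_2$, and then

test $H_0: \mu_1 - \mu_2 = \delta_0$ ($\delta_0 = 0 \implies$ ____)
find a confidence interval for $\mu_1 - \mu_2$

To do this, we need the distribution of $\bar{X}_1 - \bar{X}_2$. Recall, for independent $X$ and $Y$:

$$E(X - Y) = E(X) - E(Y), \qquad VAR(X - Y) = VAR(X) + VAR(Y)$$

and, for the mean $\bar{X}$ of a simple random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$,

$$E(\bar{X}) = \mu, \qquad VAR(\bar{X}) = \frac{\sigma^2}{n}$$

If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$, then $X - Y \sim N(\mu_X - \mu_Y, \sigma_X^2 + \sigma_Y^2)$. For normal $X$, $\bar{X} \sim N(\mu, \sigma^2/n)$. It follows that, for normal populations 1 and 2,

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

But we don't know $\sigma_1^2$ or $\sigma_2^2$. Here we assume they are equal and calculate a pooled variance estimate:

$$s_p^2 = \frac{\sum_{i=1}^{n_1} (X_{1,i} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{2,i} - \bar{X}_2)^2}{n_1 + n_2 - 2} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Now we can state a test and a confidence interval. Many hypothesis tests use test statistics of the form

$$\frac{(\text{point estimate}) - (\text{parameter value})}{(\text{estimated or true}) \text{ standard deviation of point estimate}}$$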
This ____ point estimate tells how far the estimate is from the parameter, in standard deviations. For our difference of two means, this is

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - \delta_0}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} \sim t_{n_1 + n_2 - 2}$$

Recall that many confidence intervals have the form

(point estimate) ± (margin of error)
= (point estimate) ± (table value for confidence) [standard deviation of point estimate]
= $\hat{\theta} \pm (\text{table value for confidence})\, \sigma_{\hat{\theta}}$

Our $100\%(1 - \alpha)$ confidence interval for $\mu_1 - \mu_2$, assuming normal populations and equal population variances, is

$$(\bar{X}_1 - \bar{X}_2) \pm t_{(n_1 + n_2 - 2,\ \alpha/2)} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

e.g. Test whether the lizard populations have the same population mean spike length and find a confidence interval for the difference in mean lengths.

Check normality:

[Figure: normal QQ plots, "QQ Plot of Dead" and "QQ Plot of Live"; Sample Quantiles vs. Theoretical Quantiles]

Check equal variances from the original plots, above, or from a rule of thumb: it's plausible that the population variances are equal if the larger sample variance is less than twice the smaller.

e.g. Our sample variances are ____ and ____, so we can ____.
e.g. Calculate a test and interval for the lizard spikes: ____

Summary: Suppose we have independent simple random samples from normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, where $\sigma_1^2 = \sigma_2^2$. To test $H_0: \mu_1 - \mu_2 = \delta_0$,

1. State null and alternative hypotheses, $H_0$ and $H_A$
2. Check assumptions (rule of thumb: $\sigma_1^2 = \sigma_2^2$ is plausible if $s_1^2$ and $s_2^2$ are within a factor of 2)
3. Find the pooled variance estimate $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
4. Find the test statistic $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$
5. Find the p-value, which is an area under the $t_{n_1 + n_2 - 2}$ curve depending on $H_A$:
   $H_A: \mu_1 - \mu_2 > \delta_0 \implies$ p-value $= P(T > t)$, the area right of $t$
   $H_A: \mu_1 - \mu_2 < \delta_0 \implies$ p-value $= P(T < t)$, the area left of $t$
   $H_A: \mu_1 - \mu_2 \neq \delta_0 \implies$ p-value $= P(|T| > |t|)$, the sum of the two tail areas
6. Draw a conclusion

A $100\%(1 - \alpha)$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{X}_1 - \bar{X}_2) \pm t_{(n_1 + n_2 - 2,\ \alpha/2)} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

Note: I recommend Welch's t-test, below, instead of the two-sample t-test, above. We introduced the equal-variances two-sample t-test to see its $s^2_{\text{pooled}}$, which has the form

$$\frac{\text{sum of squared deviations from respective sample means}}{\text{degrees of freedom}}$$

which we'll see again in §10 on ANOVA.
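The summary steps above can be sketched in R. This is only an illustration on simulated data: the vectors `x1` and `x2`, their sizes, and the seed are made up for the example and are not the lizard measurements. It computes the pooled variance, test statistic, two-sided p-value, and 95% interval by hand, and then compares them with R's built-in `t.test()` called with `var.equal = TRUE`.

```r
set.seed(1)
x1 = rnorm(10, mean = 22, sd = 2.5)   # hypothetical sample 1
x2 = rnorm(12, mean = 24, sd = 2.5)   # hypothetical sample 2
n1 = length(x1)
n2 = length(x2)

# Step 2 (rule of thumb): equal variances are plausible if this ratio is below 2.
var.ratio = max(var(x1), var(x2)) / min(var(x1), var(x2))

# Step 3: pooled variance estimate.
sp2 = ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# Step 4: test statistic for H0: mu1 - mu2 = 0.
se = sqrt(sp2 / n1 + sp2 / n2)
t.stat = ((mean(x1) - mean(x2)) - 0) / se

# Step 5: two-sided p-value from the t distribution with n1 + n2 - 2 df.
df = n1 + n2 - 2
p.val = 2 * pt(-abs(t.stat), df)

# 95% confidence interval for mu1 - mu2.
ci = (mean(x1) - mean(x2)) + c(-1, 1) * qt(0.975, df) * se

# The built-in equal-variances test should agree with the hand calculation.
builtin = t.test(x1, x2, var.equal = TRUE)
```

Printing `var.ratio`, `t.stat`, `p.val`, `ci`, and `builtin` side by side shows whether the rule of thumb is met and that the two routes give the same test and interval.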
8.2 The Welch T-Test (Normal without Assuming Equal Variances)

e.g. Concrete is often reinforced with steel rebar ("reinforcing bar"). Steel is strong, but tends to corrode over time. An experiment tested two corrosion-resistant materials, one fiberglass and the other carbon. Eight concrete beams with fiberglass reinforcement, and 11 with carbon reinforcement, were poured. Each was subjected to a load test, with the breaking force measured in kN (kilonewtons):

Fiberglass: 37.3, 29.6, 33.4, 33.6, 30.7, 32.7, 34.6, 32.3
Carbon: 48.8, 38.0, 42.2, 45.1, 33.8, 47.2, 50.6, 44.0, 43.9, 40.4, 45.8

Is there a difference in the (population) mean strengths of the two types of beams? We test:

$H_0: \mu_{carbon} - \mu_{fiber} =$ ____ vs. $H_A: \mu_{carbon} - \mu_{fiber}$ ____

First, make graphical and numerical summaries.

Beam Type   Sample Size   Mean   SD
Fiber       ____          ____   ____
Carbon      ____          ____   ____

[Figure: density histograms of concrete breaking force (kN), one panel for fiberglass and one for carbon]

These summaries suggest ____. Let's test this.

Is it plausible the two populations are ____? Here are QQ plots:
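[Figure: two Normal Q-Q plots, one per beam type; Sample Quantiles vs. Theoretical Quantiles]

We'll assume ____ populations. The first graph, and our rule of thumb ( ____ ), suggest ____ in the Carbon group, so we do not assume equal variances.

Suppose, then, that we have independent simple random samples from two normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, which need not be equal. Recall (from §8.1, above) that

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

We could standardize $\bar{X}_1 - \bar{X}_2$ as

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

but we don't know $\sigma_1^2$ or $\sigma_2^2$. We approximate them with $s_1^2$ and $s_2^2$, and then get a ____ distribution instead of a ____. (Recall that the $t_\nu$ distributions look like ____, but are ____ with ____.) Experts say

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_\nu \ (\text{____}), \quad \text{where} \quad \nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}, \text{ rounded down.}$$

Now we can state a test and interval if we recall the common test statistic form,

$$\frac{(\text{point estimate}) - (\text{parameter value})}{(\text{estimated or true}) \text{ standard deviation of point estimate}}$$

which tells how far the estimate is from the parameter, in standard deviations, and the common confidence interval form,

(point estimate) ± (margin of error) = $\hat{\theta} \pm (\text{table value for confidence})\, \sigma_{\hat{\theta}}$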
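A worked special case may help make the Welch degrees-of-freedom formula, $\nu = \left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2 \Big/ \left[\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}\right]$, less mysterious. Assume, purely for illustration, that the two samples have the same size, $n_1 = n_2 = n$, and happen to have the same sample variance, $s_1^2 = s_2^2 = s^2$. Then $\nu$ reduces to the degrees of freedom of the equal-variances test in §8.1:

```latex
\nu
= \frac{\left(\frac{s^2}{n} + \frac{s^2}{n}\right)^2}
       {\frac{(s^2/n)^2}{n-1} + \frac{(s^2/n)^2}{n-1}}
= \frac{4\,(s^2/n)^2}{2\,(s^2/n)^2/(n-1)}
= 2(n-1)
= n_1 + n_2 - 2
```

When the sample variances or sample sizes differ, $\nu$ is smaller than $n_1 + n_2 - 2$, which gives a slightly wider $t$ curve.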
Suppose we have independent simple random samples from normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$. To test $H_0: \mu_1 - \mu_2 = \delta_0$,

1. State null and alternative hypotheses, $H_0$ and $H_A$
2. Check assumptions
3. Find the test statistic $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
4. Find the degrees of freedom, $\nu = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$, rounded down
5. Find the p-value, which is an area under the $t_\nu$ curve depending on $H_A$:
   $H_A: \mu_1 - \mu_2 > \delta_0 \implies$ p-value $= P(T_\nu > t)$, the area right of $t$
   $H_A: \mu_1 - \mu_2 < \delta_0 \implies$ p-value $= P(T_\nu < t)$, the area left of $t$
   $H_A: \mu_1 - \mu_2 \neq \delta_0 \implies$ p-value $= P(|T_\nu| > |t|)$, the sum of the two tail areas
6. Draw a conclusion

The interval

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\nu,\ \alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

contains $\mu_1 - \mu_2$ for a proportion $1 - \alpha$ of samples.

Note that these formulas are like the §8.1 formulas, except that the estimated ____ and ____ changed.

e.g. Test $H_0: \mu_{carbon} - \mu_{fiber} =$ ____ vs. $H_A: \mu_{carbon} - \mu_{fiber}$ ____.

t = ____
ν = ____
p-value = ____
conclusion: ____

95% interval for $\mu_{carbon} - \mu_{fiber}$: ____

Compare two-sided test and interval: ____
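Here is a minimal R sketch of the Welch steps above, again on simulated data (the vectors `x1` and `x2` and the seed are hypothetical, not the beam measurements). It computes the test statistic and degrees of freedom by hand and compares them with `t.test()`, whose default is Welch's test. One difference to be aware of: R uses the unrounded $\nu$, while these notes round $\nu$ down, so the two p-values can differ slightly.

```r
set.seed(2)
x1 = rnorm(8, mean = 33, sd = 2)    # hypothetical sample 1
x2 = rnorm(11, mean = 43, sd = 5)   # hypothetical sample 2
n1 = length(x1)
n2 = length(x2)

# Estimated variances of the two sample means.
v1 = var(x1) / n1
v2 = var(x2) / n2

# Step 3: test statistic for H0: mu1 - mu2 = 0.
se = sqrt(v1 + v2)
t.stat = ((mean(x1) - mean(x2)) - 0) / se

# Step 4: Welch degrees of freedom, then rounded down as in the notes.
nu = (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
nu.down = floor(nu)

# Step 5: two-sided p-value using the rounded-down degrees of freedom.
p.val = 2 * pt(-abs(t.stat), nu.down)

# 95% confidence interval for mu1 - mu2.
ci = (mean(x1) - mean(x2)) + c(-1, 1) * qt(0.975, nu.down) * se

# Built-in Welch test (var.equal = FALSE is the default); uses unrounded nu.
builtin = t.test(x1, x2)
```

Comparing `nu` with `builtin$parameter` and `t.stat` with `builtin$statistic` confirms the formulas; rounding $\nu$ down makes the hand-computed p-value and interval slightly more conservative than R's.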
To decide between the §8.1 two-sample t-test and this §8.2 Welch's t-test, consider ____:

If population variances are equal, but are not assumed to be equal (so Welch's test is used), the test loses a little ____, but is still a good test.

If population variances are different, but are assumed equal (so the two-sample t-test is used), the test can make ____ conclusions.

8.3 Bootstrap for Two Samples

For populations that may not be normal, we can do a bootstrap test or interval for a difference of two means. It uses the Welch's $t_{obs}$ and resamples from the two samples many times, each time finding

$$\hat{t} = \frac{(\bar{x}_1^* - \bar{x}_2^*) - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\frac{s_1^{*2}}{n_1} + \frac{s_2^{*2}}{n_2}}}$$

to estimate the population distribution of $t$. (Here $\bar{x}_j^*$ and $s_j^{*2}$ are the mean and variance of a resample from sample $j$.)

e.g. When sage crickets mate, the male allows the female to eat part of his hind wings. Does female hunger influence desire to mate? An experiment randomly assigned 24 females to two groups. One group of 11 was starved for two days, while the other group of 13 was fed normally. Each female was presented with a male and the time to mating (in hours) was recorded. Do starved females have a different mean time to mating than normally fed females? Here are the data:

Starved: 1.9, 2.1, 3.8, 9.0, 9.6, 13.0, 14.7, 17.9, 21.7, 29.0, 72.3
Fed: 1.5, 1.7, 2.4, 3.6, 5.7, 22.6, 22.8, 39.0, 54.4, 72.1, 73.6, 79.5, 88.9

We test: $H_0: \mu_{starved} - \mu_{fed} =$ ____ vs. $H_A: \mu_{starved} - \mu_{fed}$ ____

Start with summaries:

Group     Sample Size   Mean   SD
Starved   ____          ____   ____
Fed       ____          ____   ____

(So far, we might consider a ____, as the variances seem ____.)

[Figure: dot plots and histograms (Percent of Total) of Time (hrs), with panels for Starved and Fed]
Note that the fed times ____, while the starved times include ____.

[Figure: normal QQ plots, "QQ Plot of Starved" and "QQ Plot of Fed"; Sample Quantiles vs. Theoretical Quantiles]

We cannot use the Welch's t-test because ____. We can again use a ____ method.

To do a bootstrap test for $H_0: \mu_1 - \mu_2 = 0$,

1. Draw simple random samples $x_{1,1}, x_{1,2}, \ldots, x_{1,n_1}$ of size $n_1$ from the first population and $x_{2,1}, x_{2,2}, \ldots, x_{2,n_2}$ of size $n_2$ from the second. Compute $\bar{x}_1$, $s_1^2$, $\bar{x}_2$, and $s_2^2$. Find $t_{obs} = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$.
2. Draw simple random samples with replacement, $x_{1,1}^*, \ldots, x_{1,n_1}^*$, from the first sample and $x_{2,1}^*, \ldots, x_{2,n_2}^*$ from the second.
3. Compute the means and variances of the resampled data for each group. Call these $\bar{x}_1^*$ and $s_1^{*2}$, and $\bar{x}_2^*$ and $s_2^{*2}$.
4. Compute the statistic $\hat{t} = \dfrac{(\bar{x}_1^* - \bar{x}_2^*) - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\frac{s_1^{*2}}{n_1} + \frac{s_2^{*2}}{n_2}}}$.
5. Repeat steps 2-4 a large number, $B$, times to get a collection of $\hat{t}$ values that approximate the sampling distribution of $t$.
6. Find the p-value, an area under the approximate sampling distribution given by $m/B$, where $m$ depends on $H_A$:
   $H_A: \mu_1 - \mu_2 > 0 \implies m$ is the number of values of $\hat{t}$ for which $\hat{t} > t_{obs}$
   $H_A: \mu_1 - \mu_2 < 0 \implies m$ is the number of values of $\hat{t}$ for which $\hat{t} < t_{obs}$
   $H_A: \mu_1 - \mu_2 \neq 0 \implies m$ is the number of values of $\hat{t}$ for which $|\hat{t}| > |t_{obs}|$
7. Draw a conclusion:
   p-value $\leq \alpha$ (where $\alpha$ is the level, .05 by default) $\implies$ reject $H_0$
   p-value $> \alpha$ $\implies$ retain $H_0$ as plausible
# Here's one way to do the bootstrap for a difference of two means in R:
starved = c(1.9, 2.1, 3.8, 9.0, 9.6, 13.0, 14.7, 17.9, 21.7, 29.0, 72.3)
fed = c(1.5, 1.7, 2.4, 3.6, 5.7, 22.6, 22.8, 39.0, 54.4, 72.1, 73.6, 79.5, 88.9)

summary(starved) # numerical summaries
sd(starved)
summary(fed)
sd(fed)

# install.packages("lattice") # Run once to download R code to your computer.
require("lattice")
all = c(starved, fed) # graphs
group = c(rep("starved", 11), rep("fed", 13))
dotplot(~all | group, layout = c(1, 2), as.table = TRUE, xlab = "Time (hrs)")
histogram(~all | group, layout = c(1, 2), as.table = TRUE, xlab = "Time (hrs)")
qqnorm(starved, main = "QQ Plot of Starved")
qqnorm(fed, main = "QQ Plot of Fed")

# dat1 and dat2 are data from the two groups. nboot is the number of resamples.
# Returns nboot bootstrap t statistics, each centered at the observed difference.
boottwo = function(dat1, dat2, nboot) {
  bootstat = numeric(nboot)
  truediff = mean(dat1) - mean(dat2)
  n1 = length(dat1)
  n2 = length(dat2)
  for (i in 1:nboot) {
    samp1 = sample(dat1, size = n1, replace = TRUE)
    samp2 = sample(dat2, size = n2, replace = TRUE)
    bootmean1 = mean(samp1)
    bootmean2 = mean(samp2)
    bootvar1 = var(samp1)
    bootvar2 = var(samp2)
    bootstat[i] = ((bootmean1 - bootmean2) - truediff) / sqrt((bootvar1 / n1) + (bootvar2 / n2))
  }
  return(bootstat)
}

B = 5000
cricketboot = boottwo(starved, fed, B)
t.obs = (mean(starved) - mean(fed)) / sqrt(var(starved) / length(starved) + var(fed) / length(fed))
low = sum(cricketboot < -abs(t.obs))   # resampled t values in the lower tail
high = sum(cricketboot > abs(t.obs))   # resampled t values in the upper tail
p.val = (low + high) / B               # two-sided bootstrap p-value
e.g. For the starved/fed cricket data, we find $t_{obs} =$ ____. I used R to run $B = 5000$ resamples and found ____ $\hat{t}$ values less than $-|t_{obs}|$ and ____ greater than $|t_{obs}|$, for a p-value of ____. We conclude ____.

8.4 The Wilcoxon Rank Sum or Mann-Whitney Test

One more test of location for two populations that may not be normal is the Wilcoxon Rank Sum Test or Mann-Whitney Test.

e.g. Consider again the cricket data:

starved: 1.9, 2.1, 3.8, 9.0, 9.6, 13.0, 14.7, 17.9, 21.7, 29.0, 72.3 ($n_{starved} = 11$)
fed: 1.5, 1.7, 2.4, 3.6, 5.7, 22.6, 22.8, 39.0, 54.4, 72.1, 73.6, 79.5, 88.9 ($n_{fed} = 13$)

For the Wilcoxon Rank Sum test, we must assume independence of sample data between and within groups and that the distributions of the two groups ____.

Our hypotheses are in terms of the two population ____ (not ____):

$H_0$: The distributions of the two groups are identical vs.
$H_A$: The distributions of the two groups ____, but one is ____ relative to the other.

The test statistic is related to ____ of the samples, so we rank the data without regard for sample, while retaining sample labels. Then we find:

$R_1$ = sum of sample 1 ranks,
$R_{min} = \dfrac{n_1(n_1 + 1)}{2}$ = minimum possible sum, and
$U = R_1 - R_{min}$

p-value:
$H_A$: population 1 is shifted left of 2 $\implies$ p-value $= P(U \leq U_{obs})$
$H_A$: population 1 is shifted right of 2 $\implies$ p-value $= P(U \geq U_{obs})$
$H_A$: population 1 is shifted from 2 $\implies$ p-value $= 2 \min[P(U \leq U_{obs}),\ P(U \geq U_{obs}),\ \tfrac{1}{2}]$
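e.g. Here are the cricket data again, sorted by time:

rank   time   sample    starved ranks
____   1.5    fed
____   1.7    fed
3      1.9    starved   ____
____   2.1    starved   ____
____   2.4    fed
____   3.6    fed
____   3.8    starved   ____
____   5.7    fed
____   9.0    starved   ____
____   9.6    starved   ____
____   13.0   starved   ____
____   14.7   starved   ____
____   17.9   starved   ____
____   21.7   starved   ____
____   22.6   fed
____   22.8   fed
____   29.0   starved   ____
____   39.0   fed
____   54.4   fed
____   72.1   fed
____   72.3   starved   ____
____   73.6   fed
____   79.5   fed
____   88.9   fed

$R_1 =$ ____   $R_{min} =$ ____   $U =$ ____

(For ____ observations, use ____ ranks. e.g. If samples had two 1.2s, they'd be #1 and #2 or #2 and #1, so each would get rank ____.)

How many possible rankings must we consider to find the p-value? ____

Here's one way to do this with R:

starved = c(1.9, 2.1, 3.8, 9.0, 9.6, 13.0, 14.7, 17.9, 21.7, 29.0, 72.3)
fed = c(1.5, 1.7, 2.4, 3.6, 5.7, 22.6, 22.8, 39.0, 54.4, 72.1, 73.6, 79.5, 88.9)
wilcox.test(starved, fed)

For the cricket data, R gives p-value ____, so we ____.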
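To see how the rank-sum quantities are built, here is a small R sketch on made-up data (the vectors `a` and `b` are hypothetical and chosen to have no ties). It ranks the pooled data, computes $R_1$, $R_{min}$, and $U$ by hand, and checks that $U$ matches the statistic `W` that `wilcox.test()` reports for its first sample.

```r
a = c(3.1, 0.4, 7.5, 2.2)        # hypothetical sample 1
b = c(5.6, 9.8, 4.3, 8.0, 6.7)   # hypothetical sample 2
n1 = length(a)
n2 = length(b)

# Rank the pooled data without regard for sample; rank() would give
# tied observations the average of the ranks they share.
r = rank(c(a, b))

R1 = sum(r[1:n1])            # sum of sample 1 ranks
Rmin = n1 * (n1 + 1) / 2     # minimum possible rank sum for sample 1
U = R1 - Rmin

# Number of ways to choose which ranks go to sample 1 under H0.
n.rankings = choose(n1 + n2, n1)

# The built-in test reports the same statistic as W.
w = wilcox.test(a, b)
```

Here sample 1 holds ranks 1, 2, 3, and 7, so `R1` is 13, `Rmin` is 10, and `U` is 3. The count `n.rankings` grows quickly with sample size, which is why software rather than hand enumeration is used for data sets like the crickets.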
e.g. Here's a simpler example for which it is not hard to calculate the p-value by hand. Suppose sample A is 4.8, 2.2 and sample B is 3.0, 1.5, 3.5. Sample A's ranks are ____ and ____, $R_1 =$ ____, $R_{min} =$ ____, and $U =$ ____.

Under $H_0$, ranks are randomly assigned to the two samples from {1, 2, 3, 4, 5}. Here are the possible sample A ranks and the statistics we get from them:

Sample A ranks    $R_1$   $- R_{min}$   $= U$
1, 2              ____    ____          ____
1, 3              ____    ____          ____
1, 4              ____    ____          ____
1, 5              ____    ____          ____
2, 3              ____    ____          ____
2, 4              ____    ____          ____
2, 5 (observed)   ____    ____          ____
3, 4              ____    ____          ____
3, 5              ____    ____          ____
4, 5              ____    ____          ____

The p-value is ____.

Conclusion: ____
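The enumeration in this small example can also be sketched in R. The code below lists every way to give two of the ranks 1 to 5 to sample A, computes $U$ for each, and applies the two-sided p-value formula $2 \min[P(U \leq U_{obs}),\ P(U \geq U_{obs}),\ \tfrac{1}{2}]$ from above. The only input taken from the example is the observed pair of sample A ranks, (2, 5); the variable names are my own.

```r
N = 5                    # total number of observations
n1 = 2                   # size of sample A
obs.ranks = c(2, 5)      # observed sample A ranks, as marked in the table

# Each column of combos is one possible pair of ranks for sample A.
combos = combn(N, n1)

Rmin = n1 * (n1 + 1) / 2
U = colSums(combos) - Rmin        # U for every possible assignment
U.obs = sum(obs.ranks) - Rmin     # U for the observed assignment

# Under H0 every assignment is equally likely, so probabilities are proportions.
p.left = mean(U <= U.obs)
p.right = mean(U >= U.obs)
p.two = 2 * min(p.left, p.right, 1 / 2)
```

Printing `rbind(combos, U)` reproduces the rows of the table, and `p.left`, `p.right`, and `p.two` give the one-sided and two-sided p-values.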
8.5 Comparing Two Population Proportions

e.g. Does handedness differ by sex? An SRS of n M = 54 males and n F = 2 females was taken. Each person indicated his or her dominant hand:

Female: 2 left, 9 right
Male: 23 left, 3 right

Let $\pi_{FL}$ = proportion of left-handed females and $\pi_{ML}$ = proportion of left-handed males in the population. We test

$H_0: \pi_{FL} - \pi_{ML} = 0$
$H_A: \pi_{FL} - \pi_{ML} \neq 0$

A natural point estimate for the population difference of proportions is $\hat{\pi}_{FL} - \hat{\pi}_{ML}$.

If $\pi_{FL} n_F$, $(1 - \pi_{FL}) n_F$, $\pi_{ML} n_M$, and $(1 - \pi_{ML}) n_M$ are all greater than 5, we can use the CLT to say

$$\hat{\pi}_{FL} - \hat{\pi}_{ML} \sim N\left(\pi_{FL} - \pi_{ML},\ \frac{\pi_{FL}(1 - \pi_{FL})}{n_F} + \frac{\pi_{ML}(1 - \pi_{ML})}{n_M}\right)$$

But we don't know $\pi_{FL}$ and $\pi_{ML}$. Under $H_0$, they are ____: $\pi_{FL} = \pi_{ML} = \pi_L$, and the distribution becomes:

$$\hat{\pi}_{FL} - \hat{\pi}_{ML} \sim N\left(0,\ \pi_L(1 - \pi_L)\left(\frac{1}{n_F} + \frac{1}{n_M}\right)\right)$$

We don't know the common proportion, but we estimate it with a weighted average of the sample proportions:

$$\hat{\pi}_L = \frac{\hat{\pi}_{FL} n_F + \hat{\pi}_{ML} n_M}{n_F + n_M} = \frac{\text{number of ____ in both samples combined}}{n_F + n_M}$$

____ our point estimate to get a test statistic:

$$Z = \frac{\hat{\pi}_{FL} - \hat{\pi}_{ML}}{\sqrt{\hat{\pi}_L(1 - \hat{\pi}_L)\left(\frac{1}{n_F} + \frac{1}{n_M}\right)}} \sim N(0, 1)$$

e.g. For the handedness data, we have:

$\hat{\pi}_{FL} =$ ____
$\hat{\pi}_{ML} =$ ____
$\hat{\pi}_L =$ ____
The (approximate) expected numbers of successes and failures are ____.

z = ____
p-value = ____
conclusion: ____

We can also make a CI. It does not come with a ____, so we use the more general form of the variance. An approximate $100(1 - \alpha)\%$ CI for $\pi_{FL} - \pi_{ML}$ is:

$$\hat{\pi}_{FL} - \hat{\pi}_{ML} \pm z_{\alpha/2} \sqrt{\frac{\hat{\pi}_{FL}(1 - \hat{\pi}_{FL})}{n_F} + \frac{\hat{\pi}_{ML}(1 - \hat{\pi}_{ML})}{n_M}}$$

For our data, a 95% interval works out to: ____

Summary: Suppose $X \sim Bin(n_X, \pi_X)$ and $Y \sim Bin(n_Y, \pi_Y)$ are independent, with $n_X \pi_X$, $n_X(1 - \pi_X)$, $n_Y \pi_Y$, and $n_Y(1 - \pi_Y)$ all > 5. To test $H_0: \pi_X - \pi_Y = 0$:

1. State null and alternative hypotheses, $H_0$ and $H_A$
2. Check assumptions
3. Find $\hat{\pi}_X = \dfrac{X}{n_X}$, $\hat{\pi}_Y = \dfrac{Y}{n_Y}$, and pooled $\hat{\pi} = \dfrac{X + Y}{n_X + n_Y}$
4. Find the test statistic, $z = \dfrac{(\hat{\pi}_X - \hat{\pi}_Y) - 0}{\sqrt{\hat{\pi}(1 - \hat{\pi})(1/n_X + 1/n_Y)}}$
5. Find the p-value, which is an area under the $N(0, 1)$ curve depending on $H_A$:
   $H_A: \pi_X - \pi_Y > 0 \implies$ p-value $= P(Z > z)$, the area right of $z$
   $H_A: \pi_X - \pi_Y < 0 \implies$ p-value $= P(Z < z)$, the area left of $z$
   $H_A: \pi_X - \pi_Y \neq 0 \implies$ p-value $= P(|Z| > |z|)$, the sum of the two tail areas
6. Draw a conclusion

A $100\%(1 - \alpha)$ confidence interval for $\pi_X - \pi_Y$ is

$$(\hat{\pi}_X - \hat{\pi}_Y) \pm z_{\alpha/2} \sqrt{\frac{\hat{\pi}_X(1 - \hat{\pi}_X)}{n_X} + \frac{\hat{\pi}_Y(1 - \hat{\pi}_Y)}{n_Y}}$$

In the next section, we compare two means when the samples are ____.
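The two-proportion summary above can be sketched in R as follows. The counts `x`, `nx`, `y`, and `ny` are hypothetical (they are not the handedness data), and the comparison with `prop.test()` assumes its continuity correction is turned off with `correct = FALSE`, in which case its `X-squared` statistic is the square of the pooled $z$.

```r
x = 18; nx = 60    # hypothetical successes and sample size, group X
y = 30; ny = 75    # hypothetical successes and sample size, group Y

# Step 3: sample proportions and pooled proportion.
px = x / nx
py = y / ny
p.pool = (x + y) / (nx + ny)

# Step 2: approximate expected successes and failures (all should exceed 5).
expected = c(nx * p.pool, nx * (1 - p.pool), ny * p.pool, ny * (1 - p.pool))

# Step 4: test statistic for H0: pi_X - pi_Y = 0, using the pooled variance.
z = ((px - py) - 0) / sqrt(p.pool * (1 - p.pool) * (1 / nx + 1 / ny))

# Step 5: two-sided p-value from the N(0, 1) curve.
p.val = 2 * pnorm(-abs(z))

# 95% confidence interval, using the unpooled (more general) variance.
ci = (px - py) + c(-1, 1) * qnorm(0.975) * sqrt(px * (1 - px) / nx + py * (1 - py) / ny)

# Built-in version without continuity correction.
builtin = prop.test(c(x, y), c(nx, ny), correct = FALSE)
```

Note the design choice the summary makes: the test uses the pooled $\hat{\pi}$ because $H_0$ says the proportions are equal, while the interval uses the separate sample proportions because it makes no such assumption.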
More informationNull Hypothesis Significance Testing p-values, significance level, power, t-tests Spring 2017
Null Hypothesis Significance Testing p-values, significance level, power, t-tests 18.05 Spring 2017 Understand this figure f(x H 0 ) x reject H 0 don t reject H 0 reject H 0 x = test statistic f (x H 0
More informationLecture 15: Inference Based on Two Samples
Lecture 15: Inference Based on Two Samples MSU-STT 351-Sum17B (P. Vellaisamy: STT 351-Sum17B) Probability & Statistics for Engineers 1 / 26 9.1 Z-tests and CI s for (µ 1 µ 2 ) The assumptions: (i) X =
More informationSampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =
2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result
More informationStatistical Inference for Means
Statistical Inference for Means Jamie Monogan University of Georgia February 18, 2011 Jamie Monogan (UGA) Statistical Inference for Means February 18, 2011 1 / 19 Objectives By the end of this meeting,
More informationSTAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.
STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances
More informationChapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics
Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely
More informationStatistics for EES and MEME 5. Rank-sum tests
Statistics for EES and MEME 5. Rank-sum tests Dirk Metzler June 4, 2018 Wilcoxon s rank sum test is also called Mann-Whitney U test References Contents [1] Wilcoxon, F. (1945). Individual comparisons by
More informationThis is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same!
Two sample tests (part II): What to do if your data are not distributed normally: Option 1: if your sample size is large enough, don't worry - go ahead and use a t-test (the CLT will take care of non-normal
More informationNonparametric Location Tests: k-sample
Nonparametric Location Tests: k-sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)
More informationAnalysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes
Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes 2 Quick review: Normal distribution Y N(µ, σ 2 ), f Y (y) = 1 2πσ 2 (y µ)2 e 2σ 2 E[Y ] =
More informationSummary of Chapters 7-9
Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two
More informationChapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.
Chapter 24 Comparing Means Copyright 2010 Pearson Education, Inc. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side. For example:
More informationAn interval estimator of a parameter θ is of the form θl < θ < θu at a
Chapter 7 of Devore CONFIDENCE INTERVAL ESTIMATORS An interval estimator of a parameter θ is of the form θl < θ < θu at a confidence pr (or a confidence coefficient) of 1 α. When θl =, < θ < θu is called
More informationNon-parametric (Distribution-free) approaches p188 CN
Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14
More informationInferences About Two Proportions
Inferences About Two Proportions Quantitative Methods II Plan for Today Sampling two populations Confidence intervals for differences of two proportions Testing the difference of proportions Examples 1
More informationSTAT 4385 Topic 01: Introduction & Review
STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics
More informationBootstrap tests. Patrick Breheny. October 11. Bootstrap vs. permutation tests Testing for equality of location
Bootstrap tests Patrick Breheny October 11 Patrick Breheny STA 621: Nonparametric Statistics 1/14 Introduction Conditioning on the observed data to obtain permutation tests is certainly an important idea
More informationBackground to Statistics
FACT SHEET Background to Statistics Introduction Statistics include a broad range of methods for manipulating, presenting and interpreting data. Professional scientists of all kinds need to be proficient
More information7 Estimation. 7.1 Population and Sample (P.91-92)
7 Estimation MATH1015 Biostatistics Week 7 7.1 Population and Sample (P.91-92) Suppose that we wish to study a particular health problem in Australia, for example, the average serum cholesterol level for
More informationContents 1. Contents
Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample
More informationInferences Based on Two Samples
Chapter 6 Inferences Based on Two Samples Frequently we want to use statistical techniques to compare two populations. For example, one might wish to compare the proportions of families with incomes below
More informationRelating Graph to Matlab
There are two related course documents on the web Probability and Statistics Review -should be read by people without statistics background and it is helpful as a review for those with prior statistics
More information7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between
7.2 One-Sample Correlation ( = a) Introduction Correlation analysis measures the strength and direction of association between variables. In this chapter we will test whether the population correlation
More informationThe t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies
The t-test: So Far: Sampling distribution benefit is that even if the original population is not normal, a sampling distribution based on this population will be normal (for sample size > 30). Benefit
More informationBasics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.
Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample
More informationInterval Estimation III: Fisher's Information & Bootstrapping
Interval Estimation III: Fisher's Information & Bootstrapping Frequentist Confidence Interval Will consider four approaches to estimating confidence interval Standard Error (+/- 1.96 se) Likelihood Profile
More information16.3 One-Way ANOVA: The Procedure
16.3 One-Way ANOVA: The Procedure Tom Lewis Fall Term 2009 Tom Lewis () 16.3 One-Way ANOVA: The Procedure Fall Term 2009 1 / 10 Outline 1 The background 2 Computing formulas 3 The ANOVA Identity 4 Tom
More informationStat 427/527: Advanced Data Analysis I
Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample
More informationUnit 14: Nonparametric Statistical Methods
Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based
More informationContents. 22S39: Class Notes / October 25, 2000 back to start 1
Contents Determining sample size Testing about the population proportion Comparing population proportions Comparing population means based on two independent samples Comparing population means based on
More informationComparing Two Variances. CI For Variance Ratio
STAT 503 Two Sample Inferences Comparing Two Variances Assume independent normal populations. Slide For Σ χ ν and Σ χ ν independent the ration Σ /ν Σ /ν follows an F-distribution with degrees of freedom
More informationSEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics
SEVERAL μs AND MEDIANS: MORE ISSUES Business Statistics CONTENTS Post-hoc analysis ANOVA for 2 groups The equal variances assumption The Kruskal-Wallis test Old exam question Further study POST-HOC ANALYSIS
More information10 One-way analysis of variance (ANOVA)
10 One-way analysis of variance (ANOVA) A factor is in an experiment; its values are. A one-way analysis of variance (ANOVA) tests H 0 : µ 1 = = µ I, where I is the for one factor, against H A : at least
More informationM(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1
Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationCh. 1: Data and Distributions
Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and
More information