Blue Not Blue
- Silvester Cody Boone
- 6 years ago
Transcription
Name: SOLUTIONS
Final Exam (take home, open everything)

One mouse study showed Brilliant Blue G, the food coloring used in blue M&Ms, accelerated healing of spinal injuries. Is the proportion of blue M&Ms different for different flavors? Below is the class's count data that made it into REDCap.

            Milk Chocolate   Peanut   Pretzel
Blue              66           18        35
Not Blue         172           77        94

(Image used without permission.)

*Q1 2pts) In mathematical terms, using Greek letters to represent parameters, what is the null hypothesis for a Chi Squared test of this data?
Ho: θmilk = θpeanut = θpretzel, where θ represents the probability of being blue.

*Q2 2pts) At a 5% significance level, what would be the rejection region for this test?
Reject Ho if X squared > 5.99 (note the test has 2 degrees of freedom).

*Q3 5pts) Under the null hypothesis, what is the expected number of Blue Milk Chocolate M&Ms, i.e. when calculating the Chi square test statistic, what would you use for the expected count E for the Blue Milk Chocolate M&Ms?
Nmilk = 238. θpooled = 119/462. Emilk = 238*119/462 = 61.303
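As a cross-check on Q2 and Q3, here is a minimal R sketch that computes every expected count and the 5% critical value. The counts are taken from the chisq.test() call used in the solutions; the object names (counts, expected, crit, x2) are illustrative choices, not part of the original.

```r
# Observed counts, as passed to chisq.test() in the solutions (column-major).
counts <- matrix(c(66, 172, 18, 77, 35, 94), nrow = 2,
                 dimnames = list(c("Blue", "NotBlue"),
                                 c("Milk", "Peanut", "Pretzel")))

# Expected count under Ho = row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
expected[1, 1]   # Blue Milk Chocolate: 238 * 119 / 462

# Critical value for a 5% test with (2 - 1) * (3 - 1) = 2 degrees of freedom.
crit <- qchisq(0.95, df = 2)

# Pearson statistic by hand; it should agree with chisq.test(counts)$statistic.
x2 <- sum((counts - expected)^2 / expected)
```

Comparing x2 with crit reproduces the decision rule from Q2 without relying on the printed output.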
*Q4 4pts) Calculate the Chi square test statistic.
In R:
> chisq.test(matrix(c(66,172,18,77,35,94), nrow=2))$statistic
X-squared
[missing]

*Q5 2pts Take Home Only) Calculate the p value.
In R:
> chisq.test(matrix(c(66,172,18,77,35,94), nrow=2))$p.value
[1] [missing]

*Q6 7pts) Set up the calculations for the 95% confidence interval for the relative risk of being blue for milk chocolate vs peanut M&Ms. Write the final solution as (LB, UB) to four decimals.
> exp(log((66/238)/(18/95)) - 1.96*sqrt(172/66/238+77/18/95))
[1] [missing]
> exp(log((66/238)/(18/95)) + 1.96*sqrt(172/66/238+77/18/95))
[1] [missing]
(0.9205, [missing])

*Q7 7pts) Set up the calculations for the 95% confidence interval for the odds ratio of being blue for milk chocolate vs peanut M&Ms. Write the final solution as (LB, UB) to four decimals.
> exp(log((66*77)/(18*172)) - 1.96*sqrt(1/66+1/18+1/172+1/77))
[1] [missing]
> exp(log((66*77)/(18*172)) + 1.96*sqrt(1/66+1/18+1/172+1/77))
[1] [missing]
(0.9132, [missing])
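The Q6 and Q7 intervals both follow the same pattern: take the log of the estimate, add and subtract 1.96 standard errors, then exponentiate. A small sketch of that pattern follows; the function names rr_ci and or_ci and their argument names are hypothetical helpers introduced here for illustration.

```r
# Wald-type CI for a relative risk, built on the log scale.
# x1 of n1 are "blue" in group 1; x2 of n2 are "blue" in group 2.
rr_ci <- function(x1, n1, x2, n2, z = 1.96) {
  log_rr <- log((x1 / n1) / (x2 / n2))
  se <- sqrt((n1 - x1) / x1 / n1 + (n2 - x2) / x2 / n2)
  exp(log_rr + c(-1, 1) * z * se)
}

# Wald-type CI for an odds ratio from the four cell counts of a 2x2 table.
or_ci <- function(blue1, not1, blue2, not2, z = 1.96) {
  log_or <- log((blue1 * not2) / (blue2 * not1))
  se <- sqrt(1 / blue1 + 1 / not1 + 1 / blue2 + 1 / not2)
  exp(log_or + c(-1, 1) * z * se)
}

rr_ci(66, 238, 18, 95)    # milk chocolate vs peanut
or_ci(66, 172, 18, 77)
```

Both lower bounds fall below 1, so neither interval excludes equal proportions of blue for the two flavors.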
An M&Ms factory uses trained laborers to detect and remove defective M&Ms, i.e. misshapen or miscolored candies. They are testing a new digital scanning device to see how it performs against the laborers. They create a set of 10,000 M&Ms that they know have 9900 acceptable and 100 defective M&Ms. They run this same set of M&Ms through the laborers' regular routine and through the digital scanner. The data for correctly identifying the 100 true positives is below for each method.

[Table: Laborers Correct / Laborers Incorrect by Digital Correct / Digital Incorrect; cell counts missing]

*Q8 2pts) In mathematical terms, using Greek letters to represent parameters, what is the null hypothesis for a test of this data?
Ho: θlaborers = θdigital, where θ represents the sensitivity of the method, i.e. the probability of correctly identifying a truly defective M&M.

**Q9 7pts) Calculate an appropriate two sided p value. Write your solution to 4 decimal places.
> binom.test( x=9, n=10, p=0.5 ) # best solution
Exact binomial test
number of successes = 9, number of trials = 10, p-value = [missing]

> prop.test( x=9, n=10, p=0.5, correct = TRUE ) # okay solution
1-sample proportions test with continuity correction
X-squared = 4.9, df = 1, p-value = [missing]

> mcnemar.test(matrix(c(89,9,1,1), nrow=2), correct=TRUE ) # okay solution
McNemar's Chi-squared test with continuity correction
McNemar's chi-squared = 4.9, df = 1, p-value = [missing]

> prop.test( x=9, n=10, p=0.5, correct = FALSE ) # fails to account for small sample size
1-sample proportions test without continuity correction
X-squared = 6.4, df = 1, p-value = [missing]

> mcnemar.test(matrix(c(89,9,1,1), nrow=2), correct=FALSE ) # fails to account for small sample size
McNemar's Chi-squared test
McNemar's chi-squared = 6.4, df = 1, p-value = [missing]
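All five calls in Q9 use only the discordant pairs (9 in one direction, 1 in the other), which is why the exact binomial test on 10 trials is the reference answer. The sketch below shows the by-hand equivalents; the variable names are illustrative.

```r
# Discordant pairs used in the Q9 calls: 9 one way, 1 the other.
n_disc <- 9 + 1

# Exact two-sided p-value from the binomial test on the discordant pairs.
p_exact <- binom.test(x = 9, n = n_disc, p = 0.5)$p.value

# By symmetry of Binomial(10, 0.5), this equals 2 * P(X <= 1).
p_by_hand <- 2 * pbinom(1, size = n_disc, prob = 0.5)

# McNemar statistics, with and without the continuity correction.
stat_cc   <- (abs(9 - 1) - 1)^2 / n_disc   # matches the reported 4.9
stat_nocc <- (9 - 1)^2 / n_disc            # matches the reported 6.4
```

The concordant count (89 and 1 in the matrix) drops out of every one of these statistics.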
*Q10 3pts) I'm at a winter holiday party and, without paying much attention, grab three M&Ms from a bowl. When I sit down on the couch, I notice all three M&Ms are green. I know regular M&Ms are 16% green, whereas the winter holiday M&Ms are evenly divided between just green and red, i.e. 50% green. What is the statistical likelihood the bowl was filled with winter holiday M&Ms?
0.50^3 = 0.125

*Q11 3pts) What is the statistical likelihood that the bowl was filled with regular M&Ms?
0.16^3 = 0.004096

*Q12 2pts) What is the likelihood ratio test statistic comparing the hypothesis H1: bowl was filled with winter holiday M&Ms vs. H2: bowl was filled with regular M&Ms?
0.50^3/0.16^3 = 30.5176

**Q13 5pts) Interpret the likelihood ratio test statistic, i.e. what does the evidence say and how strong is it?
There is moderately strong evidence in favor of the bowl being filled with holiday M&Ms over regular M&Ms.

An approximate formula for the surface area of an ellipsoid is [formula missing], where p = [missing] and h, w, and d are the radii of the height, width, and depth. Below is a selection of the estimated surface areas of M&Ms from the class's data.
Milk Chocolate: 450.2, 334.9, 355.9, 345.2
Pretzel: 530.9, 769.0, 452.4

*Q14 3pts) What is the null hypothesis for a Wilcoxon Mann Whitney Rank Sum Test comparing the surface areas of milk chocolate and pretzel M&Ms?
Ho: Fmilk(X) = Fpretzel(X), i.e. the types of M&Ms have the same distribution.

*Q15 3pts) What is the Wilcoxon Mann Whitney Rank Sum Test statistic?
Per the lectures, W = RankSumBigger = 5 + 6 + 7 = 18. You could also standardize that.
wilcox.test( c(530.9, 769.0, 452.4), c(450.2, 334.9, 355.9, 345.2) ) gives W = 12.
wilcox.test( c(450.2, 334.9, 355.9, 345.2), c(530.9, 769.0, 452.4) ) gives W = 0.
They are all based on mathematically equivalent expressions.

**Q16 6pts) Provide a two sided p value to four decimals for the rank sum test.
By hand: 2 * 1 / (7 choose 4) = 2/35 = 0.0571
By R: p-value = 0.0571
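The by-hand rank-sum answers in Q15 and Q16 can be checked by brute force, since there are only 35 ways to assign three of the seven ranks to the pretzel group. This is a sketch of that enumeration; the variable names are illustrative, and the data are the seven surface areas used in the wilcox.test() calls.

```r
milk <- c(450.2, 334.9, 355.9, 345.2)
pret <- c(530.9, 769.0, 452.4)

# Rank all seven values together; the pretzel values are the last three.
r <- rank(c(milk, pret))
w_obs <- sum(r[5:7])               # rank sum of the pretzel group

# Every possible rank sum for a group of size 3 drawn from ranks 1..7.
w_all <- combn(7, 3, FUN = sum)

# Two-sided p-value: how often a rank sum is at least as far from its
# null mean, 3 * (7 + 1) / 2 = 12, as the observed one.
p_two <- mean(abs(w_all - 12) >= abs(w_obs - 12))

# R's W subtracts the minimum possible rank sum, 3 * 4 / 2 = 6.
w_r <- w_obs - 3 * 4 / 2
```

Only the two most extreme assignments (ranks 1, 2, 3 and ranks 5, 6, 7) are as far from 12 as the observed sum, which is where the 2/35 comes from.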
A summary of the class's entire data for the surface area of Milk Chocolate and Pretzel M&Ms follows.
Milk Chocolate: mean = [missing], sd = [missing], N = 42.
Pretzel: mean = [missing], sd = [missing], N = 25.

*Q17 2pts) In mathematical terms, using Greek letters to represent parameters, what is the null hypothesis for an equal variance t test comparing the surface areas of milk chocolate and pretzel M&Ms?
Ho: µmilk = µpretzel, where µ represents the true mean surface area.

*Q18 2pts) What is the appropriate degrees of freedom for the test statistic?
42 + 25 - 2 = 65

*Q19 2pts) What is the rejection region at a 5% significance level for a two sided alternative?
Reject Ho if |t| exceeds the critical value. A conservative estimate from Rice Table 4 uses 60 df = 2.000, or from R, qt(0.975, 65) = [missing]

**Q20 7pts) What is the observed test statistic to two decimal places?
> ([missing] - [missing])/sqrt((41*[missing]^2 + 24*[missing]^2)*(1/42+1/25)/65)
[1] [missing]
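The Q20 expression is the pooled-variance t statistic computed from summary statistics. The sketch below wraps it in a function so the structure is easier to see; pooled_t is a hypothetical helper name, and the numbers in the example call are made-up placeholders, not the class's data.

```r
# Equal-variance (pooled) two-sample t statistic from summary statistics.
pooled_t <- function(m1, s1, n1, m2, s2, n2) {
  df  <- n1 + n2 - 2
  sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df   # pooled variance
  t_stat <- (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))
  c(t = t_stat, df = df, crit = qt(0.975, df))
}

# Hypothetical summary values for illustration only.
pooled_t(m1 = 400, s1 = 60, n1 = 42, m2 = 520, s2 = 90, n2 = 25)
```

With N = 42 and N = 25 the function returns df = 65 and the same critical value as qt(0.975, 65), so |t| can be compared directly with crit.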
While we used the Wilcoxon Mann Whitney Rank Sum test on the small selection of M&Ms and the equal variance t test on the full dataset, we could have used the Wilcoxon Mann Whitney Rank Sum test in both cases. Define the relative efficiency for a two sided 5% level test comparing the two statistical tests as the ratio of the tests' power under certain conditions, i.e. RE = Power(Wilcoxon Mann Whitney Rank Sum test) / Power(equal variance t test). Calculate the relative efficiency under the following settings.

***Q21 4pts) A milk chocolate ~ N(μ=490, σ=10). N milk chocolate = 42. A pretzel ~ N(μ=500, σ=50). N pretzel = 25.

# So one experiment looks like the following.
Amilk = rnorm(n=42, mean=490, sd=10)
Apret = rnorm(n=25, mean=500, sd=50)

# The equal variance t test and wilcoxon test p values are:
t.test( Amilk, Apret, var.equal=TRUE)$p.value
wilcox.test( Amilk, Apret)$p.value

# So now we just need to put this all in a big loop and save the p values.
# The number of p values < 0.05 divided by the number of loops is the power.
Nloops = 10^6
pvalst = rep( NA, Nloops )
pvalsw = rep( NA, Nloops )
for( loop in 1:Nloops ){
  Amilk = rnorm(n=42, mean=490, sd=10)
  Apret = rnorm(n=25, mean=500, sd=50)
  pvalst[loop] = t.test( Amilk, Apret, var.equal=TRUE)$p.value
  pvalsw[loop] = wilcox.test( Amilk, Apret)$p.value
}
powert = sum(pvalst < 0.05)/Nloops
powerw = sum(pvalsw < 0.05)/Nloops
RE = powerw / powert
options(scipen=20) # don't use scientific notation
c( powerw, powert, RE)

So the power is pretty low for both (<30%), but the Wilcoxon test is only about 81% as efficient as the equal variance t test. However, this is hiding an insidious fact. Consider the following simulation.
pvalst = rep( NA, Nloops )
pvalsu = rep( NA, Nloops ) # unequal var t test
pvalsw = rep( NA, Nloops )
for( loop in 1:Nloops ){
  Amilk = rnorm(n=42, mean=500, sd=10) # identical means
  Apret = rnorm(n=25, mean=500, sd=50)
  pvalst[loop] = t.test( Amilk, Apret, var.equal=TRUE)$p.value
  pvalsu[loop] = t.test( Amilk, Apret, var.equal=FALSE)$p.value
  pvalsw[loop] = wilcox.test( Amilk, Apret)$p.value
}
powert = sum(pvalst < 0.05)/Nloops
poweru = sum(pvalsu < 0.05)/Nloops
powerw = sum(pvalsw < 0.05)/Nloops
c( powerw, powert, poweru)
# This is Type I error. Only the Wilcoxon test has the proper Type I error rate for this situation.

***Q22 4pts) A milk chocolate ~ t df=3 (μ=490, σ=10). N milk chocolate = 42. A pretzel ~ t df=3 (μ=500, σ=50). N pretzel = 25. By t df=3 (μ, σ), I mean a standard t df=3 distribution scaled and shifted to have mean μ and standard deviation σ.

This is similar to Q21, but has a challenge in figuring out how to simulate the data. Let's try the default df=3 in R.
> x = rt(n=10^7,df=3)
> mean(x)
[1] [missing]
# That's pretty close to 0, which we know is right.
> sd(x)
[1] [missing]
# That's nowhere near 1. What's up? A quick search on Wikipedia tells us Var(t df=3) = 3. So the sd = sqrt(3) ≈ 1.732.

# So one experiment looks like the following.
# Dividing by sqrt(3) gives sd 1; then scale to the stated sd and shift to the stated mean.
Amilk = rt(n=42, df=3)*10/sqrt(3) + 490
Apret = rt(n=25, df=3)*50/sqrt(3) + 500

# The equal variance t test and wilcoxon test p values are:
t.test( Amilk, Apret, var.equal=TRUE)$p.value
wilcox.test( Amilk, Apret)$p.value

# So now we just need to put this all in a big loop and save the p values.
# The number of p values < 0.05 divided by the number of loops is the power.
Nloops = 10^5
pvalst = rep( NA, Nloops )
pvalsw = rep( NA, Nloops )
for( loop in 1:Nloops ){
  # scale to the stated sd and shift to the stated mean, as the question defines t df=3 (μ, σ)
  Amilk = rt(n=42, df=3)*10/sqrt(3) + 490
  Apret = rt(n=25, df=3)*50/sqrt(3) + 500
  pvalst[loop] = t.test( Amilk, Apret, var.equal=TRUE)$p.value
  pvalsw[loop] = wilcox.test( Amilk, Apret)$p.value
}
powert = sum(pvalst < 0.05)/Nloops
powerw = sum(pvalsw < 0.05)/Nloops
RE = powerw / powert
options(scipen=20) # don't use scientific notation
c( powerw, powert, RE)

So fairly low power for both tests (<40%), but the Wilcoxon test was ~10% more powerful. Note there is a lot more that I could do with this, including creating a CI (this is essentially a relative risk if we ignore the paired nature of the data) and performing a McNemar's test to see if the powers are statistically different (utilizing the paired nature of the data).
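The closing remark about a McNemar's test on the paired rejections can be sketched as follows. This is an illustration under assumptions, not the original solution: the seed, the reduced loop count of 2000, and all variable names are choices made here, and the data are shifted to the means stated in Q22.

```r
set.seed(1)        # arbitrary seed, for reproducibility only
n_loops <- 2000    # far fewer loops than the solutions use, to keep this quick

rej_t <- logical(n_loops)
rej_w <- logical(n_loops)
for (i in seq_len(n_loops)) {
  a_milk <- rt(42, df = 3) * 10 / sqrt(3) + 490
  a_pret <- rt(25, df = 3) * 50 / sqrt(3) + 500
  rej_t[i] <- t.test(a_milk, a_pret, var.equal = TRUE)$p.value < 0.05
  rej_w[i] <- wilcox.test(a_milk, a_pret)$p.value < 0.05
}

# Estimated power and its Monte Carlo standard error for each test.
power <- c(wilcoxon = mean(rej_w), t = mean(rej_t))
mc_se <- sqrt(power * (1 - power) / n_loops)

# Both tests saw the same simulated data sets, so the rejections are paired.
paired <- table(wilcoxon = factor(rej_w, levels = c(FALSE, TRUE)),
                t_test   = factor(rej_t, levels = c(FALSE, TRUE)))
mcnemar.test(paired)
```

The off-diagonal cells of paired count the data sets where exactly one test rejected, which is the information McNemar's test uses.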
***Q23 4pts) A milk chocolate ~ exponential(μ=490, σ=10). N milk chocolate = 42. A pretzel ~ exponential(μ=500, σ=50). N pretzel = 25. By exponential(μ, σ), I mean a standard exponential distribution scaled and shifted to have mean μ and standard deviation σ.

Cole's solutions for all three:
set.seed(68)
sims <- 10^6
res <- data.frame(matrix(0, ncol=3, nrow=sims))
wes <- data.frame(matrix(0, ncol=3, nrow=sims))
for(i in seq(sims)) {
  # generate normal distribution
  a1 = rnorm(42,490,10)
  b1 = rnorm(25,500,50)
  if(t.test(a1,b1,var.equal=TRUE)$p.value <= 0.05) res[i,1] <- 1
  if(wilcox.test(a1, b1)$p.value <= 0.05) wes[i,1] <- 1
  # generate t distribution
  a2 = rt(42, 3)*10/sqrt(3)+490
  b2 = rt(25, 3)*50/sqrt(3)+500
  if(t.test(a2,b2,var.equal=TRUE)$p.value <= 0.05) res[i,2] <- 1
  if(wilcox.test(a2, b2)$p.value <= 0.05) wes[i,2] <- 1
  # generate exponential distribution
  a3 = rexp(42, 1/10)+480
  b3 = rexp(25, 1/50)+450
  if(t.test(a3,b3,var.equal=TRUE)$p.value <= 0.05) res[i,3] <- 1
  if(wilcox.test(a3, b3)$p.value <= 0.05) wes[i,3] <- 1
}
# colMeans(wes) contains the estimated power of each Wilcoxon
# colMeans(res) contains the estimated power of each t test
round( colMeans(wes)/colMeans(res), 3) # relative efficiency
# Note, I stopped the simulation early at i = [missing]
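The shifts of 480 and 450 in the exponential case follow from the fact that an exponential with rate 1/σ has both mean σ and standard deviation σ, so adding μ - σ moves the mean to μ without changing the sd. A small sketch, with r_shifted_exp as a hypothetical helper name:

```r
# Exponential draws scaled and shifted to have mean mu and sd sigma.
# rexp(n, rate = 1 / sigma) has mean sigma and sd sigma, so the shift is mu - sigma.
r_shifted_exp <- function(n, mu, sigma) {
  rexp(n, rate = 1 / sigma) + (mu - sigma)
}

set.seed(68)
x <- r_shifted_exp(1e5, mu = 500, sigma = 50)   # same as rexp(., 1/50) + 450
c(mean = mean(x), sd = sd(x), min = min(x))
```

The minimum of the draws sits at μ - σ, which shows that the shifted distribution is still strongly right-skewed.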
***Q24 12pts) Previously we had looked at the volume of M&Ms assuming they were spherical or oblate spheroids. Based on the class's data that made it into REDCap, the Peanut M&Ms were more irregularly shaped than we had assumed. Here are the means and standard deviations for the height, width, and depth radii (diameters/2).
h = height radius. mean(h) = [missing] mm. sd(h) = [missing] mm.
w = width radius. mean(w) = [missing] mm. sd(w) = [missing] mm.
d = depth radius. mean(d) = [missing] mm. sd(d) = [missing] mm.
N = 32 peanut M&Ms measured.
The volume of an ellipsoid is: vol = 4pi/3 * hwd.
Using the delta method, derive a formula and calculate a 95% confidence interval for the volume of a Peanut M&M. Express your solution as (LB, UB) to two decimal places.

Below, h, w, and d denote the sample means, so Var[ h ] = sd_h^2/N, and likewise for w and d.
Var[ log(vol) ] = Var[ log( 4pi/3 * hwd ) ]
= Var[ log( 4pi/3 ) + log( h ) + log( w ) + log( d ) ]
= Var[ log( h ) ] + Var[ log( w ) ] + Var[ log( d ) ]   by independence and Var[constant]=0
≈ Var[ h ] * (1/h^2) + Var[ w ] * (1/w^2) + Var[ d ] * (1/d^2)   by the delta method
= sd_h^2/N/h^2 + sd_w^2/N/w^2 + sd_d^2/N/d^2

Assuming normality for the sampling distribution of log(vol), a 95% CI will be
exp{ log( 4pi/3 * hwd ) ± 1.96 * sqrt( sd_h^2/N/h^2 + sd_w^2/N/w^2 + sd_d^2/N/d^2 ) }
= exp( log( 4*pi/3 * [missing] * [missing] * [missing] ) ± 1.96 * sqrt( [missing]^2/32/[missing]^2 + [missing]^2/32/[missing]^2 + [missing]^2/32/[missing]^2 ) )
= ( [missing], [missing] ).

For comparison, let's create a CI based on the individual estimated volumes for each M&M. In the listing below, ... marks values missing from the transcription.
a = c( ..., 1979.2, ..., 1810.6, 994.84, 923.63, 1810.6, ..., 293.22, ..., 1504.3, ..., 376.99, ..., 293.22, ..., 282.74, ... )
mean(a) - 1.96*sd(a)/sqrt(length(a))
mean(a) + 1.96*sd(a)/sqrt(length(a))
( [missing], [missing] )
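The delta-method interval from Q24 can be written as a short function, which makes the role of each term explicit. This is a sketch under the same independence assumption as the derivation; ellipsoid_ci is a hypothetical helper name, and the inputs in the example call are made-up placeholders rather than the class's measurements.

```r
# Delta-method 95% CI for ellipsoid volume 4*pi/3 * h * w * d,
# built on the log scale from the mean and sd of each radius.
ellipsoid_ci <- function(h, w, d, sd_h, sd_w, sd_d, n, z = 1.96) {
  log_vol <- log(4 * pi / 3 * h * w * d)
  # Var[log(mean)] is approximately (sd^2 / n) / mean^2 for each radius.
  se_log <- sqrt(sd_h^2 / n / h^2 + sd_w^2 / n / w^2 + sd_d^2 / n / d^2)
  exp(log_vol + c(-1, 1) * z * se_log)
}

# Hypothetical radii in mm, for illustration only.
ci <- ellipsoid_ci(h = 6, w = 7, d = 9,
                   sd_h = 0.5, sd_w = 0.6, sd_d = 0.8, n = 32)
ci
```

Because the interval is symmetric on the log scale, the point estimate 4*pi/3 * h * w * d is the geometric mean of the two bounds rather than their midpoint.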
Name: SOLUTIONS Final Part 1 (In class, solo work, open book and notes)
Name: SOLUTIONS Final Part 1 (In class, solo work, open book and notes) Throughout the exam, show your work and, unless specified otherwise, round all your final answers to 3 decimal places, e.g. 1.0015
More informationph: 5.2, 5.6, 5.8, 6.4, 6.5, 6.8, 6.9, 7.2, 7.5 sample mean = sample sd = sample size, n = 9
Name: SOLUTIONS Final Part 1 (100 pts) and Final Part 2 (120 pts) For all of the questions below, please show enough work that it is completely clear how your final solution was derived. Sit at least one
More informationmu_(x Y) = E[ X Y ] by definition. E[ X Y ] = E[ X ] E[ Y ], regardless of the independence of X & Y. = mu_x mu_y by definition.
Your Full Name: SOLUTIONS Bios 311 Exam 2, final exam part 1 (100 pts) This quiz is open book, notes, and calculator; and closed laptop, cellphone, classmates, etc. Where necessary, work your solution
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationFish SR P Diff Sgn rank Fish SR P Diff Sng rank
Nonparametric tests Distribution free methods require fewer assumptions than parametric methods Focus on testing rather than estimation Not sensitive to outlying observations Especially useful for cruder
More informationName: SOLUTIONS Exam 01 (Midterm Part 2 take home, open everything)
Name: SOLUTIONS Exam 01 (Midterm Part 2 take home, open everything) To help you budget your time, questions are marked with *s. One * indicates a straightforward question testing foundational knowledge.
More informationAdditional Problems Additional Problem 1 Like the http://www.stat.umn.edu/geyer/5102/examp/rlike.html#lmax example of maximum likelihood done by computer except instead of the gamma shape model, we will
More informationChapter 7 Comparison of two independent samples
Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationHYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă
HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and
More informationz and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests
z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math
More informationIntroductory Statistics with R: Simple Inferences for continuous data
Introductory Statistics with R: Simple Inferences for continuous data Statistical Packages STAT 1301 / 2300, Fall 2014 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail: sungkyu@pitt.edu
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More informationDealing with the assumption of independence between samples - introducing the paired design.
Dealing with the assumption of independence between samples - introducing the paired design. a) Suppose you deliberately collect one sample and measure something. Then you collect another sample in such
More informationComparison of two samples
Comparison of two samples Pierre Legendre, Université de Montréal August 009 - Introduction This lecture will describe how to compare two groups of observations (samples) to determine if they may possibly
More informationCh. 7. One sample hypothesis tests for µ and σ
Ch. 7. One sample hypothesis tests for µ and σ Prof. Tesler Math 18 Winter 2019 Prof. Tesler Ch. 7: One sample hypoth. tests for µ, σ Math 18 / Winter 2019 1 / 23 Introduction Data Consider the SAT math
More informationStatistics Handbook. All statistical tables were computed by the author.
Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance
More informationGov Univariate Inference II: Interval Estimation and Testing
Gov 2000-5. Univariate Inference II: Interval Estimation and Testing Matthew Blackwell October 13, 2015 1 / 68 Large Sample Confidence Intervals Confidence Intervals Example Hypothesis Tests Hypothesis
More informationComparing Two Variances. CI For Variance Ratio
STAT 503 Two Sample Inferences Comparing Two Variances Assume independent normal populations. Slide For Σ χ ν and Σ χ ν independent the ration Σ /ν Σ /ν follows an F-distribution with degrees of freedom
More information18.05 Practice Final Exam
No calculators. 18.05 Practice Final Exam Number of problems 16 concept questions, 16 problems. Simplifying expressions Unless asked to explicitly, you don t need to simplify complicated expressions. For
More informationHow do we compare the relative performance among competing models?
How do we compare the relative performance among competing models? 1 Comparing Data Mining Methods Frequent problem: we want to know which of the two learning techniques is better How to reliably say Model
More informationBIOS 625 Fall 2015 Homework Set 3 Solutions
BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's
More informationPSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests
PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution
More informationSession 3 The proportional odds model and the Mann-Whitney test
Session 3 The proportional odds model and the Mann-Whitney test 3.1 A unified approach to inference 3.2 Analysis via dichotomisation 3.3 Proportional odds 3.4 Relationship with the Mann-Whitney test Session
More informationA3. Statistical Inference Hypothesis Testing for General Population Parameters
Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,
More informationExercise I.1 I.2 I.3 I.4 II.1 II.2 III.1 III.2 III.3 IV.1 Question (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Answer
Solutions to Exam in 02402 December 2012 Exercise I.1 I.2 I.3 I.4 II.1 II.2 III.1 III.2 III.3 IV.1 Question (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Answer 3 1 5 2 5 2 3 5 1 3 Exercise IV.2 IV.3 IV.4 V.1
More informationTA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM
STAT 301, Fall 2011 Name Lec 4: Ismor Fischer Discussion Section: Please circle one! TA: Sheng Zhgang... 341 (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan... 345 (W 1:20) / 346 (Th
More informationStatistics: revision
NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers
More informationExam details. Final Review Session. Things to Review
Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit
More informationContents 1. Contents
Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample
More informationSummary of Chapters 7-9
Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future
More informationOutline The Rank-Sum Test Procedure Paired Data Comparing Two Variances Lab 8: Hypothesis Testing with R. Week 13 Comparing Two Populations, Part II
Week 13 Comparing Two Populations, Part II Week 13 Objectives Coverage of the topic of comparing two population continues with new procedures and a new sampling design. The week concludes with a lab session.
More informationFrequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=
A frequency distribution is a kind of probability distribution. It gives the frequency or relative frequency at which given values have been observed among the data collected. For example, for age, Frequency
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review In the previous lecture we considered the following tests: The independent
More informationD. A 90% confidence interval for the ratio of two variances is (.023,1.99). Based on the confidence interval you will fail to reject H 0 =!
SMAM 314 Review for Exam 3 1. Mark the following statements true (T) or false(f) A. A null hypothesis that is rejected at α=.01 will always be rejected at α=.05. Β. One hundred 90% confidence intervals
More informationComparison of Two Samples
2 Comparison of Two Samples 2.1 Introduction Problems of comparing two samples arise frequently in medicine, sociology, agriculture, engineering, and marketing. The data may have been generated by observation
More informationConfidence intervals
Confidence intervals We now want to take what we ve learned about sampling distributions and standard errors and construct confidence intervals. What are confidence intervals? Simply an interval for which
More information18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages
Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution
More informationComparison of Two Population Means
Comparison of Two Population Means Esra Akdeniz March 15, 2015 Independent versus Dependent (paired) Samples We have independent samples if we perform an experiment in two unrelated populations. We have
More informationClass 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving
More informationNon-parametric tests, part A:
Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationOutline. PubH 5450 Biostatistics I Prof. Carlin. Confidence Interval for the Mean. Part I. Reviews
Outline Outline PubH 5450 Biostatistics I Prof. Carlin Lecture 11 Confidence Interval for the Mean Known σ (population standard deviation): Part I Reviews σ x ± z 1 α/2 n Small n, normal population. Large
More informationAnalysis of 2x2 Cross-Over Designs using T-Tests
Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous
More information16.400/453J Human Factors Engineering. Design of Experiments II
J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential
More informationChapter 26: Comparing Counts (Chi Square)
Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces
More informationIntroduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.
Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of
More information8 Comparing Two Populations via Independent Samples, Part 1/2
8 Comparing Two Populations via Independent Samples, Part /2 We ll study these ways of comparing two populations from independent samples:. The Two-Sample T-Test (Normal with Equal Variances) 2. The Welch
More informationAre the Digits in a Mersenne Prime Random? A Probability Model
Are the in a Are the in a Prime Random? October 11, 2016 1 / 21 Outline Are the in a 1 2 3 4 5 2 / 21 Source Are the in a This talk is my expansion of the blog post: Strings of in, gottwurfelt.com, Jan
More information4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures
Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures
More informationIntroduction to hypothesis testing
Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If
More informationContingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.
Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,
More informationStatistical Inference Theory Lesson 46 Non-parametric Statistics
46.1-The Sign Test Statistical Inference Theory Lesson 46 Non-parametric Statistics 46.1 - Problem 1: (a). Let p equal the proportion of supermarkets that charge less than $2.15 a pound. H o : p 0.50 H
More informationChapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.
Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:
More informationPhysics 509: Non-Parametric Statistics and Correlation Testing
Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests
More information11-2 Multinomial Experiment
Chapter 11 Multinomial Experiments and Contingency Tables 1 Chapter 11 Multinomial Experiments and Contingency Tables 11-11 Overview 11-2 Multinomial Experiments: Goodness-of-fitfit 11-3 Contingency Tables:
More informationHYPOTHESIS TESTING: FREQUENTIST APPROACH.
HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous
More informationNonparametric Location Tests: k-sample
Nonparametric Location Tests: k-sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)
More informationRelax and good luck! STP 231 Example EXAM #2. Instructor: Ela Jackiewicz
STP 31 Example EXAM # Instructor: Ela Jackiewicz Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned.
More informationOne-Sample and Two-Sample Means Tests
One-Sample and Two-Sample Means Tests 1 Sample t Test The 1 sample t test allows us to determine whether the mean of a sample data set is different than a known value. Used when the population variance
More informationTopic 15: Simple Hypotheses
Topic 15: November 10, 2009 In the simplest set-up for a statistical hypothesis, we consider two values θ 0, θ 1 in the parameter space. We write the test as H 0 : θ = θ 0 versus H 1 : θ = θ 1. H 0 is
More informationSpace Telescope Science Institute statistics mini-course. October Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses
Space Telescope Science Institute statistics mini-course October 2011 Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses James L Rosenberger Acknowledgements: Donald Richards, William
More informationHypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =
Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,
More informationLecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t
Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t t Confidence Interval for Population Mean Comparing z and t Confidence Intervals When neither z nor t Applies
Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling
Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included
GEOMETRIC -discrete A discrete random variable R counts number of times needed before an event occurs
STATISTICS 4 Summary Notes. Geometric and Exponential Distributions GEOMETRIC -discrete A discrete random variable R counts number of times needed before an event occurs P(X = x) = (1 - p)^(x-1) p, x = 1, 2, 3, ...
STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.
STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ² distribution We have seen several instances
Power and the computation of sample size
9 Power and the computation of sample size A statistical test will not be able to detect a true difference if the sample size is too small compared with the magnitude of the difference. When designing
Chapter 8 Class Notes Comparison of Paired Samples
Chapter 8 Class Notes Comparison of Paired Samples In this chapter, we consider the analysis of paired data. To illustrate, (in the spirit of p.332 ex.8.s.5) an agronomist randomly selected six wheat plants
CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC
CHI SQUARE ANALYSIS INTRODUCTION TO NON-PARAMETRIC ANALYSES HYPOTHESIS TESTS SO FAR We've discussed One-sample t-test Dependent Sample t-tests Independent Samples t-tests
Data analysis and Geostatistics - lecture VII
Data analysis and Geostatistics - lecture VII t-tests, ANOVA and goodness-of-fit Statistical testing - significance of r Testing the significance of the correlation coefficient: t = r sqrt(n - 2) / sqrt(1 - r^2) with
AP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Exam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (2007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
CDA Chapter 3 part II
CDA Chapter 3 part II Two-way tables with ordered classifications Let u1 ≤ u2 ≤ ... ≤ uI denote scores for the row variable X, and let ν1 ≤ ν2 ≤ ... ≤ νJ denote column Y scores. Consider the hypothesis H0: X
Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT):
Lecture Three Normal theory null distributions Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT): A random variable which is a sum of many
df=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
Stat 5102 Final Exam May 14, 2015
Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8½ × 11 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions
Formulas and Tables by Mario F. Triola
Copyright 010 Pearson Education, Inc. Ch. 3: Descriptive Statistics x f # x x f Mean 1x - x s - 1 n 1 x - 1 x s 1n - 1 s B variance s Ch. 4: Probability Mean (frequency table) Standard deviation P1A or
Lecture 26. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 26 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University December 19, 2007 Distribution-free
Chapter 23: Inferences About Means
Chapter 23: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population
Lab #11. Variable B. Variable A Y a b a+b N c d c+d a+c b+d N = a+b+c+d
BIOS 4120: Introduction to Biostatistics Breheny Lab #11 We will explore observational studies in today's lab and review how to make inferences on contingency tables. We will only use 2x2 tables for today
6 Sample Size Calculations
6 Sample Size Calculations A major responsibility of a statistician: sample size calculation. Hypothesis Testing: compare treatment 1 (new treatment) to treatment 2 (standard treatment); Assume continuous
Descriptive Statistics
Descriptive Statistics Once an experiment is carried out and the results are measured, the researcher has to decide whether the results of the treatments are different. This would be easy if the results
Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2
Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that
Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm
Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam June 8 th, 2016: 9am to 1pm Instructions: 1. This exam is to be completed independently. Do not discuss your work with
Non-Parametric Statistics: When Normal Isn't Good Enough
Non-Parametric Statistics: When Normal Isn't Good Enough Professor Ron Fricker Naval Postgraduate School Monterey, California 1/28/13 1 A Bit About Me Academic credentials Ph.D. and M.A. in Statistics,
STAT 328 (Statistical Packages)
Department of Statistics and Operations Research College of Science King Saud University Exercises STAT 328 (Statistical Packages) nashmiah r.alshammari ^-^ Excel and Minitab - 1 - Write the commands of
Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats
Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats Materials Needed: Bags of popcorn, watch with second hand or microwave with digital timer. Instructions: Follow the instructions on the
ST4241 Design and Analysis of Clinical Trials Lecture 9: N. Lecture 9: Non-parametric procedures for CRBD
ST4241 Design and Analysis of Clinical Trials Lecture 9: Non-parametric procedures for CRBD Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 9, 2016 Outline Nonparametric tests
Smart Home Health Analytics Information Systems University of Maryland Baltimore County
Smart Home Health Analytics Information Systems University of Maryland Baltimore County 1 IEEE Expert, October 1996 2 Given sample S from all possible examples D Learner L learns hypothesis h based on
Hypothesis testing. Data to decisions
Hypothesis testing Data to decisions The idea Null hypothesis: H0: the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that distribution, the
Module 9: Nonparametric Statistics Statistics (OA3102)
Module 9: Nonparametric Statistics Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 15.1-15.6 Revision: 3-12 1 Goals for this Lecture
Sociology 6Z03 Review II
Sociology 6Z03 Review II John Fox McMaster University Fall 2016 Outline: Review II Probability Part I Sampling Distributions Probability
STAT 4385 Topic 01: Introduction & Review
STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics
GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION
FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89
Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,
Economics 520 Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1, 8.3.1-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let's return to the hypothesis testing problem
ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12
ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12 Winter 2012 Lecture 13 (Winter 2011) Estimation Lecture 13 1 / 33 Review of Main Concepts Sampling Distribution of Sample Mean
Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018
Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement