Inferential Statistics and Methods for Tuning and Configuring

Size: px
Start display at page:

Download "Inferential Statistics and Methods for Tuning and Configuring"

Transcription

1 DM63 HEURISTICS FOR COMBINATORIAL OPTIMIZATION Lecture 13 and Methods for Tuning and Configuring Marco Chiarandini 1 Summar 2 3 Eperimental Designs 4 Sequential Testing (Racing) Summar 1 Summar 2 3 Eperimental Designs 1 Algorithm Engineering and Eperimental Analsis Definitions Performance Measures 2 Eplorator Data Analsis Representation of Sampled Data Regression Analsis 4 Sequential Testing (Racing) Algorithm Engineering and Eperimental Algoritmics Summar Single-pass heuristics Asmptotic heuristics: Two approaches: 1 Univariate 1a Time as an eternal parameter decided a priori 1b Solution qualit as an eternal parameter decided a priori 2 Cost dependent on running time: 1 Summar 2 We work with samples (instances, solution qualit) But we want sound conclusions: generalization over a given population (all possible instances) Thus we need statistical inference 3 Eperimental Designs 4 Sequential Testing (Racing)

2 We work with samples (instances, solution qualit) But we want sound conclusions: generalization over a given population (all possible instances) Thus we need statistical inference We work with samples (instances, solution qualit) But we want sound conclusions: generalization over a given population (all possible instances) Thus we need statistical inference Random Sample X n Statistical Estimator Inference Population P(, θ) Parameter θ Random Sample X n Statistical Estimator Inference Population P(, θ) Parameter θ Since the analsis is based on finite-sized sampled data, statements like the cost of solutions returned b algorithm A is smaller than that of algorithm B must be completed b at a level of significance of 5% A Motivating Eample A Motivating Eample There is a competition and two algorithms A 1 and A 2 are submitted We run both algorithms once on n instances On each instance either A 1 wins (+) or A 2 wins (-) or the make a tie (=) Questions: 1 If we have onl 10 instances and algorithm A 1 wins 7 times how confident are we in claiming that algorithm A 1 is the best? 2 How man instances and how man wins should we observe to gain a confidence of 95% that the algorithm A 1 is the best? A Motivating Eample A Motivating Eample Y b(n, p) : Pr[Y = ] = ( ) n p (1 p) n Y b(n, p) : Pr[Y = ] = ( ) n p (1 p) n Under the conditions of question 1, we can check how unlikel the situation is if it were p(+) p( ) If p = 05 then the chance that algorithm A 1 wins 7 or more times out of 10 is 172%: quite high! b(10,05) A Motivating Eample Y b(n, p) : Pr[Y = ] = Under the conditions of question 1, we can check how unlikel the situation is if it were p(+) p( ) If p = 05 then the chance that algorithm A 1 wins 7 or more times out of 10 is 172%: quite high! ( ) n p (1 p) n To answer question 2 we compute the 95% quantile, ie, : Pr[Y ] < 005 with p = 05 at different values of n: n General procedure: Assume that data are consistent with a null hpothesis H 0 (eg, sample data are drawn from distributions with the same mean value) Use a statistical test to compute how likel this is to be true, given the data collected This likel is quantified as the p-value Accept H 0 as true if the p-value is larger than an user defined threshold called level of significance α Alternativel (p-value < α), H 0 is rejected in favor of an alternative hpothesis, H 1, at a level of significance of α n

3 Two kinds of errors ma be committed when testing hpothesis: General rule: α = P(tpe I error) = P(reject H 0 H 0 is true) β = P(tpe II error) = P(fail to reject H 0 H 0 is false) 1 specif the tpe I error or level of significance α 2 seek the test with a suitable large statistical power, ie, 1 β = P(reject H 0 H 0 is false) Theorem: Central Limit Theorem If X n is a random sample from an arbitrar distribution with mean µ and variance σ then the average X n is asmptoticall normall distributed, ie, X n N(µ, σ2 n ) or z = X n µ σ/ N(0, 1) n Consequences: allows inference from a sample allows to model errors in measurements: X = µ + ɛ Issues: n should be enough large µ and σ must be known Weibull distribution Inference: Hpothesis Testing and Confidence Intervals dweibull(, shape = 14) z = X µ σ/ n A test of hpothesis determines how likel an sampled estimate ^θ is to occur under some assumptions on the parameter θ of the population σ Pr µ z 1 n X σ } µ+z 2 n = 1 α µ X1 X2 X3 Samples of size 1, 5, 15, 50 repeated 100 times n=1 n=5 n= n=50 A confidence interval contains all those values that a parameter θ is likel to assume with probabilit 1 α: Pr(^θ 1 < θ < ^θ 2 ) = 1 α σ Pr X z1 n µ X+z σ } 2 n = 1 α µ X1 X2 X3 The Procedure of Test of Hpothesis The Confidence Intervals Procedure µ 1 µ 2 θ 1 Specif the parameter θ and the test hpothesis, 2 Obtain P(θ θ = 0), the null distribution of θ 3 Compare ^θ with the α/2-quantiles (for two-sided tests) of P(θ θ = 0) and reject or not H 0 according to whether ^θ is larger or smaller than this value µ 1 µ 2 θ θ = 0 1 Specif the parameter θ and the test hpothesis, 2 Obtain P(θ, θ = 0), the null distribution of θ in correspondence of the observed estimate ^θ of the sample X 3 Determine (^θ, ^θ + ) such that Pr^θ θ ^θ + } = 1 α 4 Do not reject H 0 if θ = 0 falls inside the interval (^θ, ^θ + ) Otherwise reject H 0 The Confidence Intervals Procedure The Confidence Intervals Procedure N(µ 1, σ) N(µ 2, σ) ( X 1, S X1 ) ( X 2, S X2 ) θ = 0 T = ( X 1 X 2 ) `µ 1 µ r 2 SX1 S X2 r T Student s t Distribution 1 Specif the parameter θ and the test hpothesis, 2 Obtain P(θ, θ = 0), the null distribution of θ in correspondence of the observed estimate ^θ of the sample X 3 Determine (^θ, ^θ + ) such that Pr^θ θ ^θ + } = 1 α 4 Do not reject H 0 if θ = 0 falls inside the interval (^θ, ^θ + ) Otherwise reject H 0 P(θ 1 ) P(θ 2 ) θ = X 1 X 2 θ = 0 1 Specif the parameter θ and the test hpothesis, 2 Obtain P(θ, θ = 0), the null distribution of θ in correspondence of the observed estimate ^θ of the sample X 3 Determine (^θ, ^θ + ) such that Pr^θ θ ^θ + } = 1 α 4 Do not reject H 0 if θ = 0 falls inside the interval (^θ, ^θ + ) Otherwise reject H 0

4 Kolmogorov-Smirnov Tests Parametric vs Nonparametric The test compares empirical cumulative distribution functions F() F() It uses maimal difference between the two curves, sup F 1 () F 2 (), and assesses how likel this value is under the null hpothesis that the two curves come from the same data The test can be used as a two-samples or single-sample test (in this case to test against theoretical distributions: goodness of fit) R functions: F() Preparation of the Eperiments 2 Parametric assumptions: independence homoschedasticit normalit N(µ, σ) Nonparametric assumptions: independence homoschedasticit P(θ) tests Permutation tests Eact Conditional Monte Carlo The test can be done in R with kstest Variance reduction techniques Same pseudo random seed 1 Summar Sample Sizes If the sample size is large enough (infinit) an difference in the means of the factors, no matter how small, will be significant Real vs Statistical significance Stud factors until the improvement in the response variable is deemed small Desired statistical power + practical precision sample size Note: If resources available for N runs then the optimal design is one run on N instances [?] 2 3 Eperimental Designs 4 Sequential Testing (Racing) The Design of Eperiments for Algorithms Eperimental Design Statement of the objectives of the eperiment Comparison of different algorithms Impact of algorithm components How instance features affect the algorithms Identification of the sources of variance Treatment factors (qualitative and quantitative) Controllable nuisance factors blocking Uncontrollable nuisance factors measuring Definition of factor combinations to test Easiest design: Unreplicated or Replicated Full Factorial Design Running a pilot eperiment and refine the design Bugs and no eternal biases Ceiling or floor effects Rescaling levels of quantitative factors Detect the number of eperiments needed to obtained the desired power Algorithms Treatment Factor; Instances Blocking Factor Design A: One run on various instances (Unreplicated Factorial) Algorithm 1 Algorithm 2 Algorithm k Instance 1 X 11 X 12 X 1k Instance b X b1 X b2 X bk Design B: Several runs on various instances (Replicated Factorial) Algorithm 1 Algorithm 2 Algorithm k Instance 1 X 111,, X 11r X 121,, X 12r X 1k1,, X 1kr Instance 2 X 211,, X 21r X 221,, X 22r X 2k1,, X 2kr Instance b X b11,, X b1r X b21,, X b2r X bk1,, X bkr Multiple Comparisons Univariate Analsis: On a single instance H 0 : µ 1 = µ 2 = µ 3 = H 1 : at least one differs} Appling a statistical test to all pairs the error of Tpe I is not α but higher: α EX = 1 (1 α) c Eg, for α = 005 and c = 3 α EX = 014! Adjustment methods Protected versions: global test + no adjustments Bonferroni α = α EX /c (conservative) Tuke Honest Significance Method (for parametric analsis) Holm (step-wise) Other step procedures Post-hoc analsis: Once the effect of factors has been recognized a finer grained analsis is performed to distinguish where important differences are Global tests Parametric KS tpe Replicated F-test Kruskall-Wallis Test Pooled Permutations Birnbaum-Hall test

5 Univariate Analsis: On a single instance Univariate Analsis: On a Class of Instances Pairwise tests Replicated Parametric KS tpe t-test Tuke HSD Kruskall-Wallis Test or Mann-Whitne test Wilcoon Rank Sum Test Pooled Permutations Birnbaum-Hall test Global tests Unreplicated Replicated Parametric F-test F-test Simple Permutations Snchronized Permutations Matched pairs versions: when, when not t-test with different variances Univariate Analsis: On a Class of Instances An Eample Pairwise tests Unreplicated Replicated Parametric t-test Tuke HSD or Wilcoon Signed Rank Test Simple Permutations Matched pairs versions: when, when not t-test Welch variant: no assumption of equal variances t-test Tuke HSD Snchronized Permutations SLS algorithms for Graph Coloring: Results collected on a set of benchmark instances Instance TSN1 Instance Succ k Succ k Succ k Succ k Succ k flat flat flat flat flat flat GLS SAN2 TSN3 Instance Succ k Succ k Succ k Succ k flat flat flat flat flat flat An Eample Raw data on the instances: flat300_20_ flat1000_50_0 flat300_26_ flat1000_60_0 flat300_28_ flat1000_76_0 R functions: > load( gcp all classesdatar ) > G <- F[F$class== Flat,] > bwplot(alg ~ col inst,data=g,scales=list(=list(relation= free )),pch= ) > boplot(err3~alg,data=g,horizontal=true,main=epression(paste( Invariant error:, frac(-^(opt),^(worst)-^(opt)))),notch=true,col= pink ) > boplot(rank~alg,data=g,horizontal=true,main= Ranks,notch=TRUE,col= pink ) col An Eample An Eample (opt) Invariant error: (worst) (opt) (opt) Invariant error: (worst) (opt) Ranks Ranks Note: notches are not appropriate for comparative inference

6 R functions: R functions: > par(las=1,mar=c(3,8,3,1)) > plot(tukehsd(aov(err3~alg*inst,data=g),which= alg ),las=1,mar=c(3,7,3,1)) > pairwisewilcotest(g$err3,g$alg,paired=true) Pairwise comparisons using Wilcoon rank sum test data: G$err3 and G$alg e e e e-07 30e e-08 58e-10 58e-10 58e-10 58e-10 58e-10 58e-10 58e-10 P value adjustment method: holm 95% famil wise confidence level An Eample An Eample Alg 3 Alg 2 Alg 1 MSD 2 X1 X2 X1 X2 X3 Minimal Significant Difference (MSD) interval that satisfies simultaneousl each comparison Differences are statisticall significant if the confidence intervals do not overlap Average Inveriant Error (Tuke's Honset Significance Difference) Average Inveriant Error (Permutation Test) Average Rank () Unreplicated Designs 1 Summar 2 3 Eperimental Designs 4 Sequential Testing (Racing) Procedure Race [?]: repeat Randoml select an unseen instance and run all candidates on it Perform all-pairwise comparison statistical tests Drop all candidates that are significantl inferior to the best algorithm until onl one candidate left or no more unseen instances ; F-Race use Friedman test Note: If resources available for N runs then the optimal design is one run on N instances [?] Replicated Designs Sequential Testing 7 algorithms, 5 instances, 3 runs Procedure Race repeat Run all candidates once on all instances Perform all-pairwise comparison statistical tests Drop all candidates that are significantl inferior to the best algorithm until onl one candidate left or maimal number of runs eceeded ;

7 Sequential Testing Sequential Testing 7 algorithms, 5 instances, 3 runs 7 algorithms, 5 instances, 3 runs 3 algorithms, 5 instances, 9 runs 3 algorithms, 5 instances, 9 runs 2 algorithms, 5 instances, 30 runs Sequential Testing class GEOMb (11 Instances) S_Seq_SL_Y O_DCFA O_DCFB O_CCFB S_RLF_Y O_CCFA S_RLF_N O_DRRB O_DRRA S_D_s_N O_CRRB O_DCRA O_CRRA S_D_g_N O_DCRB O_CCRA O_CCRB S_D_g_Y S_D_s_Y Stage

Outline. 2. Sequential Testing (Racing) Marco Chiarandini. Tests for our Scenarios Example. Parametric F-test F-test

Outline. 2. Sequential Testing (Racing) Marco Chiarandini. Tests for our Scenarios Example. Parametric F-test F-test Outline DM812 METAHEURISTICS Lecture 4 Empirical Methods for Configuring and Tuning 1. Statistical 2. (Racing) Marco Chiarandini Department of Mathematics and Computer Science University of Southern Denmark,

More information

Compare several group means: ANOVA

Compare several group means: ANOVA 1 - Introduction Compare several group means: ANOVA Pierre Legendre, Université de Montréal August 009 Objective: compare the means of several () groups for a response variable of interest. The groups

More information

Rank-Based Methods. Lukas Meier

Rank-Based Methods. Lukas Meier Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D.

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D. Research Design - - Topic 15a Introduction to Multivariate Analses 009 R.C. Gardner, Ph.D. Major Characteristics of Multivariate Procedures Overview of Multivariate Techniques Bivariate Regression and

More information

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science Statistiek II John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa Dept of Information Science j.nerbonne@rug.nl February 13, 2014 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated

More information

More about Single Factor Experiments

More about Single Factor Experiments More about Single Factor Experiments 1 2 3 0 / 23 1 2 3 1 / 23 Parameter estimation Effect Model (1): Y ij = µ + A i + ɛ ij, Ji A i = 0 Estimation: µ + A i = y i. ˆµ = y..  i = y i. y.. Effect Modell

More information

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics SEVERAL μs AND MEDIANS: MORE ISSUES Business Statistics CONTENTS Post-hoc analysis ANOVA for 2 groups The equal variances assumption The Kruskal-Wallis test Old exam question Further study POST-HOC ANALYSIS

More information

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures

More information

Math 141. Lecture 16: More than one group. Albyn Jones 1. jones/courses/ Library 304. Albyn Jones Math 141

Math 141. Lecture 16: More than one group. Albyn Jones 1.   jones/courses/ Library 304. Albyn Jones Math 141 Math 141 Lecture 16: More than one group Albyn Jones 1 1 Library 304 jones@reed.edu www.people.reed.edu/ jones/courses/141 Comparing two population means If two distributions have the same shape and spread,

More information

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01 An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there

More information

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution

More information

Workshop Research Methods and Statistical Analysis

Workshop Research Methods and Statistical Analysis Workshop Research Methods and Statistical Analysis Session 2 Data Analysis Sandra Poeschl 08.04.2013 Page 1 Research process Research Question State of Research / Theoretical Background Design Data Collection

More information

where x i and u i are iid N (0; 1) random variates and are mutually independent, ff =0; and fi =1. ff(x i )=fl0 + fl1x i with fl0 =1. We examine the e

where x i and u i are iid N (0; 1) random variates and are mutually independent, ff =0; and fi =1. ff(x i )=fl0 + fl1x i with fl0 =1. We examine the e Inference on the Quantile Regression Process Electronic Appendix Roger Koenker and Zhijie Xiao 1 Asymptotic Critical Values Like many other Kolmogorov-Smirnov type tests (see, e.g. Andrews (1993)), the

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Advanced Statistics II: Non Parametric Tests

Advanced Statistics II: Non Parametric Tests Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Math Review Summer 0 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classif the hpothesis test as two-tailed, left-tailed, or right-tailed.

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information

Data Mining. CS57300 Purdue University. March 22, 2018

Data Mining. CS57300 Purdue University. March 22, 2018 Data Mining CS57300 Purdue University March 22, 2018 1 Hypothesis Testing Select 50% users to see headline A Unlimited Clean Energy: Cold Fusion has Arrived Select 50% users to see headline B Wedding War

More information

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of

More information

Basic Statistical Analysis

Basic Statistical Analysis indexerrt.qxd 8/21/2002 9:47 AM Page 1 Corrected index pages for Sprinthall Basic Statistical Analysis Seventh Edition indexerrt.qxd 8/21/2002 9:47 AM Page 656 Index Abscissa, 24 AB-STAT, vii ADD-OR rule,

More information

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data ST4241 Design and Analysis of Clinical Trials Lecture 7: Non-parametric tests for PDG data Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 2, 2016 Outline Non-parametric

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Kruskal-Wallis and Friedman type tests for. nested effects in hierarchical designs 1

Kruskal-Wallis and Friedman type tests for. nested effects in hierarchical designs 1 Kruskal-Wallis and Friedman type tests for nested effects in hierarchical designs 1 Assaf P. Oron and Peter D. Hoff Department of Statistics, University of Washington, Seattle assaf@u.washington.edu, hoff@stat.washington.edu

More information

From the help desk: It s all about the sampling

From the help desk: It s all about the sampling The Stata Journal (2002) 2, Number 2, pp. 90 20 From the help desk: It s all about the sampling Allen McDowell Stata Corporation amcdowell@stata.com Jeff Pitblado Stata Corporation jsp@stata.com Abstract.

More information

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics Chapter 13 Student Lecture Notes 13-1 Department of Quantitative Methods & Information Sstems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analsis QMIS 0 Dr. Mohammad

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of

More information

Group comparison test for independent samples

Group comparison test for independent samples Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences between means. Supposing that: samples come from normal populations

More information

Multiple Comparison Testing for Experimental Chemotherapy Based on Multivariate Covariance Analysis

Multiple Comparison Testing for Experimental Chemotherapy Based on Multivariate Covariance Analysis Journal of Statistical and Econometric Methods, vol., no., 0, -0 ISSN: 9-0 (print), 9-99 (online) Scienpress Ltd, 0 Multiple Comparison Testing for Experimental Chemotherap Based on Multivariate Covariance

More information

13: Additional ANOVA Topics

13: Additional ANOVA Topics 13: Additional ANOVA Topics Post hoc comparisons Least squared difference The multiple comparisons problem Bonferroni ANOVA assumptions Assessing equal variance When assumptions are severely violated Kruskal-Wallis

More information

Math 475. Jimin Ding. August 29, Department of Mathematics Washington University in St. Louis jmding/math475/index.

Math 475. Jimin Ding. August 29, Department of Mathematics Washington University in St. Louis   jmding/math475/index. istical A istic istics : istical Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html August 29, 2013 istical August 29, 2013 1 / 18 istical A istic

More information

Bootstrapping, Randomization, 2B-PLS

Bootstrapping, Randomization, 2B-PLS Bootstrapping, Randomization, 2B-PLS Statistics, Tests, and Bootstrapping Statistic a measure that summarizes some feature of a set of data (e.g., mean, standard deviation, skew, coefficient of variation,

More information

Definitions of terms and examples. Experimental Design. Sampling versus experiments. For each experimental unit, measures of the variables of

Definitions of terms and examples. Experimental Design. Sampling versus experiments. For each experimental unit, measures of the variables of Experimental Design Sampling versus experiments similar to sampling and inventor design in that information about forest variables is gathered and analzed experiments presuppose intervention through appling

More information

Statistics for EES Factorial analysis of variance

Statistics for EES Factorial analysis of variance Statistics for EES Factorial analysis of variance Dirk Metzler June 12, 2015 Contents 1 ANOVA and F -Test 1 2 Pairwise comparisons and multiple testing 6 3 Non-parametric: The Kruskal-Wallis Test 9 1 ANOVA

More information

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes 2 Quick review: Normal distribution Y N(µ, σ 2 ), f Y (y) = 1 2πσ 2 (y µ)2 e 2σ 2 E[Y ] =

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test

Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test la Contents The two sample t-test generalizes into Analysis of Variance. In analysis of variance ANOVA the population consists

More information

Contrasts and Multiple Comparisons Supplement for Pages

Contrasts and Multiple Comparisons Supplement for Pages Contrasts and Multiple Comparisons Supplement for Pages 302-323 Brian Habing University of South Carolina Last Updated: July 20, 2001 The F-test from the ANOVA table allows us to test the null hypothesis

More information

Non-parametric tests, part A:

Non-parametric tests, part A: Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy If your experiment needs statistics, you ought to have done a better experiment. -Ernest Rutherford Lecture 1 Lecture 2 Why do we need statistics? Definitions Statistical

More information

Chapter 12. Analysis of variance

Chapter 12. Analysis of variance Serik Sagitov, Chalmers and GU, January 9, 016 Chapter 1. Analysis of variance Chapter 11: I = samples independent samples paired samples Chapter 1: I 3 samples of equal size J one-way layout two-way layout

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics Mathematics Curriculum A. DESCRIPTION This is a full year courses designed to introduce students to the basic elements of statistics and probability. Emphasis is placed on understanding terminology and

More information

Lecture 4: Two-point Sampling, Coupon Collector s problem

Lecture 4: Two-point Sampling, Coupon Collector s problem Randomized Algorithms Lecture 4: Two-point Sampling, Coupon Collector s problem Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized Algorithms

More information

Lecture 30. DATA 8 Summer Regression Inference

Lecture 30. DATA 8 Summer Regression Inference DATA 8 Summer 2018 Lecture 30 Regression Inference Slides created by John DeNero (denero@berkeley.edu) and Ani Adhikari (adhikari@berkeley.edu) Contributions by Fahad Kamran (fhdkmrn@berkeley.edu) and

More information

Statistical NLP: Lecture 7. Collocations. (Ch 5) Introduction

Statistical NLP: Lecture 7. Collocations. (Ch 5) Introduction Statistical NLP: Lecture 7 Collocations Ch 5 Introduction Collocations are characterized b limited compositionalit. Large overlap between the concepts of collocations and terms, technical term and terminological

More information

Empirical Methods for the Analysis of Algorithms

Empirical Methods for the Analysis of Algorithms Empirical Methods for the Analysis of Algorithms Marco Chiarandini 1 Luís Paquete 2 1 Department of Mathematics and Computer Science University of Southern Denmark, Odense, Denmark 2 CISUC - Centre for

More information

arxiv: v1 [cs.ne] 9 Aug 2018

arxiv: v1 [cs.ne] 9 Aug 2018 Sample size estimation for power and accuracy in the experimental comparison of algorithms Felipe Campelo 1,2 and Fernanda Takahashi 1,3 arxiv:1808.02997v1 [cs.ne] 9 Aug 2018 1 Operations Research and

More information

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0 Analysis of variance (ANOVA) ANOVA Comparing the means of more than two groups Like a t-test, but can compare more than two groups Asks whether any of two or more means is different from any other. In

More information

A3. Statistical Inference

A3. Statistical Inference Appendi / A3. Statistical Inference / Mean, One Sample-1 A3. Statistical Inference Population Mean μ of a Random Variable with known standard deviation σ, and random sample of size n 1 Before selecting

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Analysis of Variance (ANOVA) Much of statistical inference centers around the ability to distinguish between two or more groups in terms of some underlying response variable y. Sometimes, there are but

More information

STATISTICS 4, S4 (4769) A2

STATISTICS 4, S4 (4769) A2 (4769) A2 Objectives To provide students with the opportunity to explore ideas in more advanced statistics to a greater depth. Assessment Examination (72 marks) 1 hour 30 minutes There are four options

More information

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap University of Zurich Department of Economics Working Paper Series ISSN 1664-7041 (print) ISSN 1664-705X (online) Working Paper No. 254 Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and

More information

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC CHI SQUARE ANALYSIS I N T R O D U C T I O N T O N O N - P A R A M E T R I C A N A L Y S E S HYPOTHESIS TESTS SO FAR We ve discussed One-sample t-test Dependent Sample t-tests Independent Samples t-tests

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Chapter 5 Introduction to Factorial Designs

Chapter 5 Introduction to Factorial Designs Chapter 5 Introduction to Factorial Designs 5. Basic Definitions and Principles Stud the effects of two or more factors. Factorial designs Crossed: factors are arranged in a factorial design Main effect:

More information

ADDITIONAL STATISTICAL ANALYSES. The data were not normally distributed (Kolmogorov-Smirnov test; Legendre &

ADDITIONAL STATISTICAL ANALYSES. The data were not normally distributed (Kolmogorov-Smirnov test; Legendre & DDITIONL STTISTICL NLYSES The data were not normally distributed (Kolmogorov-Smirnov test; Legendre & Legendre, 1998) and violate the assumption of equal variances (Levene test; Rohlf & Sokal, 1994) for

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

Introduction to Nonparametric Statistics

Introduction to Nonparametric Statistics Introduction to Nonparametric Statistics by James Bernhard Spring 2012 Parameters Parametric method Nonparametric method µ[x 2 X 1 ] paired t-test Wilcoxon signed rank test µ[x 1 ], µ[x 2 ] 2-sample t-test

More information

Section 3 : Permutation Inference

Section 3 : Permutation Inference Section 3 : Permutation Inference Fall 2014 1/39 Introduction Throughout this slides we will focus only on randomized experiments, i.e the treatment is assigned at random We will follow the notation of

More information

Sequential Analysis & Testing Multiple Hypotheses,

Sequential Analysis & Testing Multiple Hypotheses, Sequential Analysis & Testing Multiple Hypotheses, CS57300 - Data Mining Spring 2016 Instructor: Bruno Ribeiro 2016 Bruno Ribeiro This Class: Sequential Analysis Testing Multiple Hypotheses Nonparametric

More information

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

Week 14 Comparing k(> 2) Populations

Week 14 Comparing k(> 2) Populations Week 14 Comparing k(> 2) Populations Week 14 Objectives Methods associated with testing for the equality of k(> 2) means or proportions are presented. Post-testing concepts and analysis are introduced.

More information

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large Z Test Comparing a group mean to a hypothesis T test (about 1 mean) T test (about 2 means) Comparing mean to sample mean. Similar means = will have same response to treatment Two unknown means are different

More information

MATH Notebook 3 Spring 2018

MATH Notebook 3 Spring 2018 MATH448001 Notebook 3 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 3 MATH448001 Notebook 3 3 3.1 One Way Layout........................................

More information

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

More information

13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics. Post hoc Comparisons 13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Post hoc Comparisons In the prior chapter we used ANOVA

More information

http://www.statsoft.it/out.php?loc=http://www.statsoft.com/textbook/ Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences

More information

Physics Lab 1 - Measurements

Physics Lab 1 - Measurements Phsics 203 - Lab 1 - Measurements Introduction An phsical science requires measurement. This lab will involve making several measurements of the fundamental units of length, mass, and time. There is no

More information

Analysis of Variance

Analysis of Variance Analysis of Variance Blood coagulation time T avg A 62 60 63 59 61 B 63 67 71 64 65 66 66 C 68 66 71 67 68 68 68 D 56 62 60 61 63 64 63 59 61 64 Blood coagulation time A B C D Combined 56 57 58 59 60 61

More information

Things you always wanted to know about statistics but were afraid to ask

Things you always wanted to know about statistics but were afraid to ask Things you always wanted to know about statistics but were afraid to ask Christoph Amma Felix Putze Design and Evaluation of Innovative User Interfaces 6.12.13 1/43 Overview In the last lecture, we learned

More information

Lecture 5: Two-point Sampling

Lecture 5: Two-point Sampling Randomized Algorithms Lecture 5: Two-point Sampling Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Randomized Algorithms - Lecture 5 1 / 26 Overview A. Pairwise

More information

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model Biostatistics 250 ANOVA Multiple Comparisons 1 ORIGIN 1 Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model When the omnibus F-Test for ANOVA rejects the null hypothesis that

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Confidence Intervals for Population Mean

Confidence Intervals for Population Mean Confidence Intervals for Population Mean Reading: Sections 7.1, 7.2, 7.3 Learning Objectives: Students should be able to: Understand the meaning and purpose of confidence intervals Calculate a confidence

More information

Chapter 3: Statistical methods for estimation and testing. Key reference: Statistical methods in bioinformatics by Ewens & Grant (2001).

Chapter 3: Statistical methods for estimation and testing. Key reference: Statistical methods in bioinformatics by Ewens & Grant (2001). Chapter 3: Statistical methods for estimation and testing Key reference: Statistical methods in bioinformatics by Ewens & Grant (2001). Chapter 3: Statistical methods for estimation and testing Key reference:

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text)

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text) Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text) 1. A quick and easy indicator of dispersion is a. Arithmetic mean b. Variance c. Standard deviation

More information

Summary: the confidence interval for the mean (σ 2 known) with gaussian assumption

Summary: the confidence interval for the mean (σ 2 known) with gaussian assumption Summary: the confidence interval for the mean (σ known) with gaussian assumption on X Let X be a Gaussian r.v. with mean µ and variance σ. If X 1, X,..., X n is a random sample drawn from X then the confidence

More information

Multiple Sample Numerical Data

Multiple Sample Numerical Data Multiple Sample Numerical Data Analysis of Variance, Kruskal-Wallis test, Friedman test University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 /

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,

More information

Inference about the Slope and Intercept

Inference about the Slope and Intercept Inference about the Slope and Intercept Recall, we have established that the least square estimates and 0 are linear combinations of the Y i s. Further, we have showed that the are unbiased and have the

More information

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1 Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

Transition Passage to Descriptive Statistics 28

Transition Passage to Descriptive Statistics 28 viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of

More information

The Nonparametric Bootstrap

The Nonparametric Bootstrap The Nonparametric Bootstrap The nonparametric bootstrap may involve inferences about a parameter, but we use a nonparametric procedure in approximating the parametric distribution using the ECDF. We use

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Experimental Design and Data Analysis for Biologists

Experimental Design and Data Analysis for Biologists Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

sphericity, 5-29, 5-32 residuals, 7-1 spread and level, 2-17 t test, 1-13 transformations, 2-15 violations, 1-19

sphericity, 5-29, 5-32 residuals, 7-1 spread and level, 2-17 t test, 1-13 transformations, 2-15 violations, 1-19 additive tree structure, 10-28 ADDTREE, 10-51, 10-53 EXTREE, 10-31 four point condition, 10-29 ADDTREE, 10-28, 10-51, 10-53 adjusted R 2, 8-7 ALSCAL, 10-49 ANCOVA, 9-1 assumptions, 9-5 example, 9-7 MANOVA

More information

P 0 (x 0, y 0 ), P 1 (x 1, y 1 ), P 2 (x 2, y 2 ), P 3 (x 3, y 3 ) The goal is to determine a third degree polynomial of the form,

P 0 (x 0, y 0 ), P 1 (x 1, y 1 ), P 2 (x 2, y 2 ), P 3 (x 3, y 3 ) The goal is to determine a third degree polynomial of the form, Bezier Curves While working for the Renault automobile compan in France, an engineer b the name of P. Bezier developed a sstem for designing car bodies based partl on some fairl straightforward mathematics.

More information