Adv. App. Stat. Presentation On a paradoxical property of the Kolmogorov Smirnov two-sample test

Size: px
Start display at page:

Download "Adv. App. Stat. Presentation On a paradoxical property of the Kolmogorov Smirnov two-sample test"

Transcription

1 Adv. App. Stat. Presentation On a paradoxical property of the Kolmogorov Smirnov two-sample test Stefan Hasselgren Niels Bohr Institute March 9, 2017 Slide 1/11

2 Bias of Kolmogorov g.o.f. test Draw a sample X 1,..., X n with unknown d.f. F. Based on these, we want to test the hypothesis where F 0 is a fixed d.f. H 0 : F = F 0, Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 2/11

3 A test is unbiased if the probability of rejecting the null hypothesis Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 3/11

4 A test is unbiased if the probability of rejecting the null hypothesis (a) is greater than (or equal to) the significance level when the alternative is true Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 3/11

5 A test is unbiased if the probability of rejecting the null hypothesis (a) is greater than (or equal to) the significance level when the alternative is true and (b) is less than (or equal to) the significance level when the null hypothesis is true. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 3/11

6 A test is unbiased if the probability of rejecting the null hypothesis (a) is greater than (or equal to) the significance level when the alternative is true and (b) is less than (or equal to) the significance level when the null hypothesis is true. The test is biased for the alternative hypothesis, if (a) is not true while (b) is still true. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 3/11

7 Now consider a test, which has the following properties: Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

8 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

9 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. G n is the sample d.f. of the sample X 1,..., X n, Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

10 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. G n is the sample d.f. of the sample X 1,..., X n, d(g n, F 0 ) is the distance in the space of d.f. s. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

11 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. G n is the sample d.f. of the sample X 1,..., X n, d(g n, F 0 ) is the distance in the space of d.f. s. δ α is a distance associated with the significance level α. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

12 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. G n is the sample d.f. of the sample X 1,..., X n, d(g n, F 0 ) is the distance in the space of d.f. s. δ α is a distance associated with the significance level α. 2 The test is distribution free. Essentially, no assumptions are made about the underlying distribution of the sample. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

13 Now consider a test, which has the following properties: 1 Reject the null hypothesis if d(g n, F 0 ) > δ α. G n is the sample d.f. of the sample X 1,..., X n, d(g n, F 0 ) is the distance in the space of d.f. s. δ α is a distance associated with the significance level α. 2 The test is distribution free. Essentially, no assumptions are made about the underlying distribution of the sample. 3 Such a test is called a "distance-based test." Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 4/11

14 Now, let s get technical. For some distribution function F ; Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 5/11

15 Now, let s get technical. For some distribution function F ; take all the d.f. s with distance d from F (so all F i where d(f i, F ) = d), Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 5/11

16 Now, let s get technical. For some distribution function F ; take all the d.f. s with distance d from F (so all F i where d(f i, F ) = d), put them in a metric space (somehow), Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 5/11

17 Now, let s get technical. For some distribution function F ; take all the d.f. s with distance d from F (so all F i where d(f i, F ) = d), put them in a metric space (somehow), then make a ball in this space with radius δ > 0 and centre at F. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 5/11

18 Now, let s get technical. For some distribution function F ; take all the d.f. s with distance d from F (so all F i where d(f i, F ) = d), put them in a metric space (somehow), then make a ball in this space with radius δ > 0 and centre at F. Label this ball B(F, δ). Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 5/11

19 Now we take a d.f. F 0, and then suppose that (for some α > 0) there exists a d.f. F a, such that the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 6/11

20 Now we take a d.f. F 0, and then suppose that (for some α > 0) there exists a d.f. F a, such that the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), and there is a non-zero probability of the sample d.f. being in B(F 0, δ α ), but not in B(F a, δ α ), given that F a is the true d.f. Then the distance-based test is biased for the alternative hypothesis F a. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 6/11

21 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

22 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

23 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

24 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

25 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

26 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} 1 α. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

27 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} 1 α. But since the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), then Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

28 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} 1 α. But since the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), then P Fa {G n B(F 0, δ α )} > 1 α, Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

29 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} 1 α. But since the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), then P Fa {G n B(F 0, δ α )} > 1 α, which is the same as P Fa {d(g n, F 0 ) > δ α } < α. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

30 Proof The proof is fairly simple. Start by taking a sample X 1,..., X n from F a. This sample has sample d.f. G n. Then P Fa {G n B(F a, δ α )} 1 α. But since the ball B(F a, δ α ) is strictly inside the ball B(F 0, δ α ), then P Fa {G n B(F 0, δ α )} > 1 α, which is the same as P Fa {d(g n, F 0 ) > δ α } < α. But this is strictly against the demand (a) for an unbiased test! And so the distance-based test is biased for the alternative F a. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 7/11

31 Bias of the KS two-sample test for different sample sizes Take two samples X 1,..., X m from F and Y 1,..., Y n from G. The null hypothesis is then H 0 : F = G. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 8/11

32 Bias of the KS two-sample test for different sample sizes Take two samples X 1,..., X m from F and Y 1,..., Y n from G. The null hypothesis is then H 0 : F = G. We can then easily choose or assume F and G to satisfy the conditions above. The article proves, that for equal sample size n = m, the KS two-sample test is unbiased. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 8/11

33 Bias of the KS two-sample test for different sample sizes Take two samples X 1,..., X m from F and Y 1,..., Y n from G. The null hypothesis is then H 0 : F = G. We can then easily choose or assume F and G to satisfy the conditions above. The article proves, that for equal sample size n = m, the KS two-sample test is unbiased. However, for sufficiently large nm, the KS test is biased for the alternative F = F a G, since in the limit for m we obtain the Kolmogorov g.o.f. test. Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 8/11

34 Thank you! Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 9/11

35 Example Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 10/11

36 Example Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 10/11

37 Example Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 11/11

38 Example Stefan Hasselgren (Niels Bohr Institute) Adv. App. Stat. Presentation Slide 11/11

Simple and Multiple Linear Regression

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

MATH 240. Chapter 8 Outlines of Hypothesis Tests

MATH 240. Chapter 8 Outlines of Hypothesis Tests MATH 4 Chapter 8 Outlines of Hypothesis Tests Test for Population Proportion p Specify the null and alternative hypotheses, ie, choose one of the three, where p is some specified number: () H : p H : p

More information

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between 7.2 One-Sample Correlation ( = a) Introduction Correlation analysis measures the strength and direction of association between variables. In this chapter we will test whether the population correlation

More information

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan COSC 341 Human Computer Interaction Dr. Bowen Hui University of British Columbia Okanagan 1 Last Topic Distribution of means When it is needed How to build one (from scratch) Determining the characteristics

More information

Difference between means - t-test /25

Difference between means - t-test /25 Difference between means - t-test 1 Discussion Question p492 Ex 9-4 p492 1-3, 6-8, 12 Assume all variances are not equal. Ignore the test for variance. 2 Students will perform hypothesis tests for two

More information

James H. Steiger. Department of Psychology and Human Development Vanderbilt University. Introduction Factors Influencing Power

James H. Steiger. Department of Psychology and Human Development Vanderbilt University. Introduction Factors Influencing Power Department of Psychology and Human Development Vanderbilt University 1 2 3 4 5 In the preceding lecture, we examined hypothesis testing as a general strategy. Then, we examined how to calculate critical

More information

10.2: The Chi Square Test for Goodness of Fit

10.2: The Chi Square Test for Goodness of Fit 10.2: The Chi Square Test for Goodness of Fit We can perform a hypothesis test to determine whether the distribution of a single categorical variable is following a proposed distribution. We call this

More information

POLI 443 Applied Political Research

POLI 443 Applied Political Research POLI 443 Applied Political Research Session 4 Tests of Hypotheses The Normal Curve Lecturer: Prof. A. Essuman-Johnson, Dept. of Political Science Contact Information: aessuman-johnson@ug.edu.gh College

More information

POLI 443 Applied Political Research

POLI 443 Applied Political Research POLI 443 Applied Political Research Session 6: Tests of Hypotheses Contingency Analysis Lecturer: Prof. A. Essuman-Johnson, Dept. of Political Science Contact Information: aessuman-johnson@ug.edu.gh College

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

Evaluating Hypotheses

Evaluating Hypotheses Evaluating Hypotheses IEEE Expert, October 1996 1 Evaluating Hypotheses Sample error, true error Confidence intervals for observed hypothesis error Estimators Binomial distribution, Normal distribution,

More information

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means

More information

Lecture 2: CDF and EDF

Lecture 2: CDF and EDF STAT 425: Introduction to Nonparametric Statistics Winter 2018 Instructor: Yen-Chi Chen Lecture 2: CDF and EDF 2.1 CDF: Cumulative Distribution Function For a random variable X, its CDF F () contains all

More information

Ch. 11 Inference for Distributions of Categorical Data

Ch. 11 Inference for Distributions of Categorical Data Ch. 11 Inference for Distributions of Categorical Data CH. 11 2 INFERENCES FOR RELATIONSHIPS The two sample z procedures from Ch. 10 allowed us to compare proportions of successes in two populations or

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 3: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population

More information

Section 3 : Permutation Inference

Section 3 : Permutation Inference Section 3 : Permutation Inference Fall 2014 1/39 Introduction Throughout this slides we will focus only on randomized experiments, i.e the treatment is assigned at random We will follow the notation of

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Chapter 9. Inferences from Two Samples. Objective. Notation. Section 9.2. Definition. Notation. q = 1 p. Inferences About Two Proportions

Chapter 9. Inferences from Two Samples. Objective. Notation. Section 9.2. Definition. Notation. q = 1 p. Inferences About Two Proportions Chapter 9 Inferences from Two Samples 9. Inferences About Two Proportions 9.3 Inferences About Two s (Independent) 9.4 Inferences About Two s (Matched Pairs) 9.5 Comparing Variation in Two Samples Objective

More information

Lecture 28 Chi-Square Analysis

Lecture 28 Chi-Square Analysis Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not

More information

STAT 515 fa 2016 Lec Statistical inference - hypothesis testing

STAT 515 fa 2016 Lec Statistical inference - hypothesis testing STAT 515 fa 2016 Lec 20-21 Statistical inference - hypothesis testing Karl B. Gregory Wednesday, Oct 12th Contents 1 Statistical inference 1 1.1 Forms of the null and alternate hypothesis for µ and p....................

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data ST4241 Design and Analysis of Clinical Trials Lecture 7: Non-parametric tests for PDG data Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 2, 2016 Outline Non-parametric

More information

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career. Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences

More information

Section 3: Permutation Inference

Section 3: Permutation Inference Section 3: Permutation Inference Yotam Shem-Tov Fall 2015 Yotam Shem-Tov STAT 239/ PS 236A September 26, 2015 1 / 47 Introduction Throughout this slides we will focus only on randomized experiments, i.e

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

-However, this definition can be expanded to include: biology (biometrics), environmental science (environmetrics), economics (econometrics).

-However, this definition can be expanded to include: biology (biometrics), environmental science (environmetrics), economics (econometrics). Chemometrics Application of mathematical, statistical, graphical or symbolic methods to maximize chemical information. -However, this definition can be expanded to include: biology (biometrics), environmental

More information

2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006

2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006 and F Distributions Lecture 9 Distribution The distribution is used to: construct confidence intervals for a variance compare a set of actual frequencies with expected frequencies test for association

More information

:the actual population proportion are equal to the hypothesized sample proportions 2. H a

:the actual population proportion are equal to the hypothesized sample proportions 2. H a AP Statistics Chapter 14 Chi- Square Distribution Procedures I. Chi- Square Distribution ( χ 2 ) The chi- square test is used when comparing categorical data or multiple proportions. a. Family of only

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

AUTOCORRELATION. Phung Thanh Binh

AUTOCORRELATION. Phung Thanh Binh AUTOCORRELATION Phung Thanh Binh OUTLINE Time series Gauss-Markov conditions The nature of autocorrelation Causes of autocorrelation Consequences of autocorrelation Detecting autocorrelation Remedial measures

More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

f (1 0.5)/n Z =

f (1 0.5)/n Z = Math 466/566 - Homework 4. We want to test a hypothesis involving a population proportion. The unknown population proportion is p. The null hypothesis is p = / and the alternative hypothesis is p > /.

More information

16.3 One-Way ANOVA: The Procedure

16.3 One-Way ANOVA: The Procedure 16.3 One-Way ANOVA: The Procedure Tom Lewis Fall Term 2009 Tom Lewis () 16.3 One-Way ANOVA: The Procedure Fall Term 2009 1 / 10 Outline 1 The background 2 Computing formulas 3 The ANOVA Identity 4 Tom

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling

More information

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing & z-test Lecture Set 11 We have a coin and are trying to determine if it is biased or unbiased What should we assume? Why? Flip coin n = 100 times E(Heads) = 50 Why? Assume we count 53 Heads... What could

More information

Chapter 9: Sampling Distributions

Chapter 9: Sampling Distributions Chapter 9: Sampling Distributions 1 Activity 9A, pp. 486-487 2 We ve just begun a sampling distribution! Strictly speaking, a sampling distribution is: A theoretical distribution of the values of a statistic

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Rank-Based Methods. Lukas Meier

Rank-Based Methods. Lukas Meier Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data

More information

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling

More information

Lecture on Null Hypothesis Testing & Temporal Correlation

Lecture on Null Hypothesis Testing & Temporal Correlation Lecture on Null Hypothesis Testing & Temporal Correlation CS 590.21 Analysis and Modeling of Brain Networks Department of Computer Science University of Crete Acknowledgement Resources used in the slides

More information

FVA Analysis and Forecastability

FVA Analysis and Forecastability FVA Analysis and Forecastability Michael Gilliland, CFPIM Product Marketing Manager - Forecasting SAS About SAS World s largest private software company $2.43 billion revenue in 2010 50,000 customer sites

More information

9.5 t test: one μ, σ unknown

9.5 t test: one μ, σ unknown GOALS: 1. Recognize the assumptions for a 1 mean t test (srs, nd or large sample size, population stdev. NOT known). 2. Understand that the actual p value (area in the tail past the test statistic) is

More information

Quantitative Techniques - Lecture 8: Estimation

Quantitative Techniques - Lecture 8: Estimation Quantitative Techniques - Lecture 8: Estimation Key words: Estimation, hypothesis testing, bias, e ciency, least squares Hypothesis testing when the population variance is not known roperties of estimates

More information

One-Sample Numerical Data

One-Sample Numerical Data One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test

Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test Non-Parametric Two-Sample Analysis: The Mann-Whitney U Test When samples do not meet the assumption of normality parametric tests should not be used. To overcome this problem, non-parametric tests can

More information

Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012

Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012 Econometric Methods Prediction / Violation of A-Assumptions Burcu Erdogan Universität Trier WS 2011/2012 (Universität Trier) Econometric Methods 30.11.2011 1 / 42 Moving on to... 1 Prediction 2 Violation

More information

Parameter Estimation, Sampling Distributions & Hypothesis Testing

Parameter Estimation, Sampling Distributions & Hypothesis Testing Parameter Estimation, Sampling Distributions & Hypothesis Testing Parameter Estimation & Hypothesis Testing In doing research, we are usually interested in some feature of a population distribution (which

More information

Lecture 06. DSUR CH 05 Exploring Assumptions of parametric statistics Hypothesis Testing Power

Lecture 06. DSUR CH 05 Exploring Assumptions of parametric statistics Hypothesis Testing Power Lecture 06 DSUR CH 05 Exploring Assumptions of parametric statistics Hypothesis Testing Power Introduction Assumptions When broken then we are not able to make inference or accurate descriptions about

More information

Table 1: Fish Biomass data set on 26 streams

Table 1: Fish Biomass data set on 26 streams Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain

More information

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

We know from STAT.1030 that the relevant test statistic for equality of proportions is: 2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

A Practitioner s Guide to Cluster-Robust Inference

A Practitioner s Guide to Cluster-Robust Inference A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01 An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulation APPM 7400 Lesson 3: Testing Random Number Generators Part II: Uniformity September 5, 2018 Lesson 3: Testing Random Number GeneratorsPart II: Uniformity Stochastic Simulation September

More information

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Design of Engineering Experiments

Design of Engineering Experiments Design of Engineering Experiments Hussam Alshraideh Chapter 2: Some Basic Statistical Concepts October 4, 2015 Hussam Alshraideh (JUST) Basic Stats October 4, 2015 1 / 29 Overview 1 Introduction Basic

More information

The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing

The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing 1 The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing Greene Ch 4, Kennedy Ch. R script mod1s3 To assess the quality and appropriateness of econometric estimators, we

More information

Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares

Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares K&W introduce the notion of a simple experiment with two conditions. Note that the raw data (p. 16)

More information

Econometrics Part Three

Econometrics Part Three !1 I. Heteroskedasticity A. Definition 1. The variance of the error term is correlated with one of the explanatory variables 2. Example -- the variance of actual spending around the consumption line increases

More information

Stat 231 Exam 2 Fall 2013

Stat 231 Exam 2 Fall 2013 Stat 231 Exam 2 Fall 2013 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed 1 1. Some IE 361 students worked with a manufacturer on quantifying the capability

More information

Goodness-of-fit Tests for the Normal Distribution Project 1

Goodness-of-fit Tests for the Normal Distribution Project 1 Goodness-of-fit Tests for the Normal Distribution Project 1 Jeremy Morris September 29, 2005 1 Kolmogorov-Smirnov Test The Kolmogorov-Smirnov Test (KS test) is based on the cumulative distribution function

More information

Multiple Regression Analysis: Heteroskedasticity

Multiple Regression Analysis: Heteroskedasticity Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing

More information

Hypothesis Testing. Week 04. Presented by : W. Rofianto

Hypothesis Testing. Week 04. Presented by : W. Rofianto Hypothesis Testing Week 04 Presented by : W. Rofianto Tests about a Population Mean: σ unknown Test Statistic t x 0 s / n This test statistic has a t distribution with n - 1 degrees of freedom. Example:

More information

Econometrics. 4) Statistical inference

Econometrics. 4) Statistical inference 30C00200 Econometrics 4) Statistical inference Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Confidence intervals of parameter estimates Student s t-distribution

More information

STAT 830 Hypothesis Testing

STAT 830 Hypothesis Testing STAT 830 Hypothesis Testing Richard Lockhart Simon Fraser University STAT 830 Fall 2018 Richard Lockhart (Simon Fraser University) STAT 830 Hypothesis Testing STAT 830 Fall 2018 1 / 30 Purposes of These

More information

Exact Statistical Inference in. Parametric Models

Exact Statistical Inference in. Parametric Models Exact Statistical Inference in Parametric Models Audun Sektnan December 2016 Specialization Project Department of Mathematical Sciences Norwegian University of Science and Technology Supervisor: Professor

More information

A note on Devaney's definition of chaos

A note on Devaney's definition of chaos University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2004 A note on Devaney's definition of chaos Joseph

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Module 5 Practice problem and Homework answers

Module 5 Practice problem and Homework answers Module 5 Practice problem and Homework answers Practice problem What is the mean for the before period? Answer: 5.3 x = 74.1 14 = 5.3 What is the mean for the after period? Answer: 6.8 x = 95.3 14 = 6.8

More information

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1 Linear Regression Machine Learning CSE546 Kevin Jamieson University of Washington Oct 5, 2017 1 The regression problem Given past sales data on zillow.com, predict: y = House sale price from x = {# sq.

More information

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015 STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis

More information

Solution to Tutorial 7

Solution to Tutorial 7 1. (a) We first fit the independence model ST3241 Categorical Data Analysis I Semester II, 2012-2013 Solution to Tutorial 7 log µ ij = λ + λ X i + λ Y j, i = 1, 2, j = 1, 2. The parameter estimates are

More information

Stats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16

Stats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16 Stats Review Chapter 14 Revised 8/16 Note: This review is meant to highlight basic concepts from the course. It does not cover all concepts presented by your instructor. Refer back to your notes, unit

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

1/11/2011. Chapter 4: Variability. Overview

1/11/2011. Chapter 4: Variability. Overview Chapter 4: Variability Overview In statistics, our goal is to measure the amount of variability for a particular set of scores, a distribution. In simple terms, if the scores in a distribution are all

More information

Frequentist Statistics and Hypothesis Testing Spring

Frequentist Statistics and Hypothesis Testing Spring Frequentist Statistics and Hypothesis Testing 18.05 Spring 2018 http://xkcd.com/539/ Agenda Introduction to the frequentist way of life. What is a statistic? NHST ingredients; rejection regions Simple

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

Hyperspacings and the Estimation of Information Theoretic Quantities

Hyperspacings and the Estimation of Information Theoretic Quantities Hyperspacings and the Estimation of Information Theoretic Quantities Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA 0003 elm@cs.umass.edu Abstract

More information

Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables

Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables 26.1 S 4 /IEE Application Examples: Multiple Regression An S 4 /IEE project was created to improve the 30,000-footlevel metric

More information

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis

More information

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 13, Part A: Analysis of Variance and Experimental Design Introduction to Analysis of Variance Analysis

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in

More information

Econometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 11 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 30 Recommended Reading For the today Advanced Time Series Topics Selected topics

More information

Wednesday, October 17 Handout: Hypothesis Testing and the Wald Test

Wednesday, October 17 Handout: Hypothesis Testing and the Wald Test Amherst College Department of Economics Economics 360 Fall 2012 Wednesday, October 17 Handout: Hypothesis Testing and the Wald Test Preview No Money Illusion Theory: Calculating True] o Clever Algebraic

More information

Forefront of the Two Sample Problem

Forefront of the Two Sample Problem Forefront of the Two Sample Problem From classical to state-of-the-art methods Yuchi Matsuoka What is the Two Sample Problem? X ",, X % ~ P i. i. d Y ",, Y - ~ Q i. i. d Two Sample Problem P = Q? Example:

More information

STATISTICS ASSIGNMENT 2

STATISTICS ASSIGNMENT 2 STATISTICS ASSIGNMENT 2 Matteo Sostero 815831 June 10, 2010 Introduction The following document is a brief statistical report as part of the second assignment. It covers the issues raised on a dataset

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

Paper Reference R. Statistics S4 Advanced/Advanced Subsidiary. Friday 21 June 2013 Morning Time: 1 hour 30 minutes

Paper Reference R. Statistics S4 Advanced/Advanced Subsidiary. Friday 21 June 2013 Morning Time: 1 hour 30 minutes Centre No. Candidate No. Paper Reference(s) 6686/01R Edexcel GCE Statistics S4 Advanced/Advanced Subsidiary Friday 21 June 2013 Morning Time: 1 hour 30 minutes Materials required for examination Mathematical

More information

STAT 830 Hypothesis Testing

STAT 830 Hypothesis Testing STAT 830 Hypothesis Testing Hypothesis testing is a statistical problem where you must choose, on the basis of data X, between two alternatives. We formalize this as the problem of choosing between two

More information

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

More information

Chapter 11 - Lecture 1 Single Factor ANOVA

Chapter 11 - Lecture 1 Single Factor ANOVA April 5, 2013 Chapter 9 : hypothesis testing for one population mean. Chapter 10: hypothesis testing for two population means. What comes next? Chapter 9 : hypothesis testing for one population mean. Chapter

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information