Adv. App. Stat. Presentation On a paradoxical property of the Kolmogorov Smirnov two-sample test
- Clifton Merritt
Adv. App. Stat. Presentation: On a paradoxical property of the Kolmogorov-Smirnov two-sample test. Stefan Hasselgren, Niels Bohr Institute, March 9, 2017.
Bias of the Kolmogorov g.o.f. test: Draw a sample $X_1, \dots, X_n$ with unknown d.f. $F$. Based on this sample, we want to test the hypothesis $H_0 : F = F_0$, where $F_0$ is a fixed d.f.
A test is unbiased if the probability of rejecting the null hypothesis (a) is greater than or equal to the significance level when the alternative is true, and (b) is less than or equal to the significance level when the null hypothesis is true. The test is biased for the alternative hypothesis if (a) fails while (b) still holds.
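The two conditions can be written compactly with a power function. The symbol $\beta$ is notation assumed here for illustration; it does not appear on the slides.

```latex
\beta(F) = P_F(\text{reject } H_0), \qquad
\text{(a)}\;\; \beta(F) \ge \alpha \ \text{ for every alternative } F \ne F_0, \qquad
\text{(b)}\;\; \beta(F_0) \le \alpha .
```

In this notation, the test is biased for a particular alternative $F_a$ when $\beta(F_a) < \alpha$ while (b) still holds.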
Now consider a test with the following properties:
1. Reject the null hypothesis if $d(G_n, F_0) > \delta_\alpha$, where $G_n$ is the sample d.f. of the sample $X_1, \dots, X_n$, $d(G_n, F_0)$ is the distance in the space of d.f.s, and $\delta_\alpha$ is a distance associated with the significance level $\alpha$.
2. The test is distribution free. Essentially, no assumptions are made about the underlying distribution of the sample.
Such a test is called a "distance-based test."
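A minimal sketch of such a distance-based test, under two assumptions not stated on the slide: the distance $d$ is the Kolmogorov (sup) distance, and $\delta_\alpha$ is taken from the Dvoretzky-Kiefer-Wolfowitz inequality, which gives a conservative threshold for any continuous d.f. The function names are hypothetical.

```python
import math
import random

def kolmogorov_distance(sample, cdf):
    """sup_x |G_n(x) - F0(x)| for the empirical d.f. G_n of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # G_n jumps from i/n to (i+1)/n at x, so check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def dkw_threshold(n, alpha):
    """Assumed choice of delta_alpha: DKW bound, so P(d > delta) <= alpha."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

def distance_based_test(sample, cdf, alpha=0.05):
    """Return True if H0: F = F0 is rejected, i.e. d(G_n, F0) > delta_alpha."""
    return kolmogorov_distance(sample, cdf) > dkw_threshold(len(sample), alpha)

def uniform_cdf(x):
    """d.f. of the uniform distribution on [0, 1], used as F0 in this example."""
    return min(1.0, max(0.0, x))

rng = random.Random(0)
sample = [rng.random() for _ in range(50)]
print(distance_based_test(sample, uniform_cdf))
```

The threshold depends only on $n$ and $\alpha$, not on $F_0$, which is what "distribution free" means in property 2. An exact critical value from the Kolmogorov distribution would be slightly smaller than the DKW value used here.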
Now, let's get technical. For some distribution function $F$: take all the d.f.s with distance $d$ from $F$ (so all $F_i$ where $d(F_i, F) = d$), put them in a metric space (somehow), then make a ball in this space with radius $\delta > 0$ and centre at $F$. Label this ball $B(F, \delta)$.
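One way to make this concrete is with the Kolmogorov (sup) distance. This is an assumption here, since the slide leaves $d$ unspecified:

```latex
d(F, G) = \sup_{x} \lvert F(x) - G(x) \rvert, \qquad
B(F, \delta) = \{\, G : d(G, F) \le \delta \,\}
```

With this convention, a test that rejects when $d(G_n, F_0) > \delta_\alpha$ accepts $H_0$ exactly when $G_n \in B(F_0, \delta_\alpha)$.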
Now take a d.f. $F_0$, and suppose that (for some $\alpha > 0$) there exists a d.f. $F_a$ such that the ball $B(F_a, \delta_\alpha)$ is strictly inside the ball $B(F_0, \delta_\alpha)$, and there is a non-zero probability of the sample d.f. being in $B(F_0, \delta_\alpha)$ but not in $B(F_a, \delta_\alpha)$, given that $F_a$ is the true d.f. Then the distance-based test is biased for the alternative hypothesis $F_a$.
Proof. The proof is fairly simple. Start by taking a sample $X_1, \dots, X_n$ from $F_a$, with sample d.f. $G_n$. Because the test is distribution free, the same $\delta_\alpha$ applies when $F_a$ plays the role of the null, so $P_{F_a}\{G_n \in B(F_a, \delta_\alpha)\} \ge 1 - \alpha$. But since the ball $B(F_a, \delta_\alpha)$ is strictly inside the ball $B(F_0, \delta_\alpha)$, and $G_n$ falls in the difference of the two balls with non-zero probability, $P_{F_a}\{G_n \in B(F_0, \delta_\alpha)\} > 1 - \alpha$, which is the same as $P_{F_a}\{d(G_n, F_0) > \delta_\alpha\} < \alpha$. But this contradicts requirement (a) for an unbiased test, and so the distance-based test is biased for the alternative $F_a$.
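The argument can be checked numerically with a small simulation. Everything below is an illustrative assumption rather than material from the slides: $F_0$ is uniform on $[0, 1]$, $d$ is the sup distance, $\delta_\alpha$ is the DKW threshold, and the alternative $F_a$ is a hypothetical construction that pulls mass away from both tails so that $B(F_a, \delta_\alpha)$ lies inside $B(F_0, \delta_\alpha)$.

```python
import math
import random

def kolmogorov_distance(sample, cdf):
    """sup_x |G_n(x) - cdf(x)| for the empirical d.f. G_n of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def uniform_cdf(x):
    """F0: uniform d.f. on [0, 1]."""
    return min(1.0, max(0.0, x))

def make_fa_quantile(delta):
    """Quantile function of the hypothetical alternative F_a (needs delta < 0.5).

    F_a(x) = x^2/delta on [0, delta], x on [delta, 1 - delta],
    and 1 - (1 - x)^2/delta on [1 - delta, 1].
    """
    def quantile(u):
        if u < delta:
            return math.sqrt(u * delta)
        if u > 1.0 - delta:
            return 1.0 - math.sqrt((1.0 - u) * delta)
        return u
    return quantile

def rejection_rates(n=20, alpha=0.05, trials=20000, seed=1):
    """Estimate P(reject H0: F = F0) when the truth is F0 and when it is F_a."""
    delta = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))  # DKW threshold (assumed)
    quantile = make_fa_quantile(delta)
    rng = random.Random(seed)
    rej0 = rej_a = 0
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]   # sample from F0
        x = [quantile(v) for v in u]           # coupled sample from F_a
        rej0 += kolmogorov_distance(u, uniform_cdf) > delta
        rej_a += kolmogorov_distance(x, uniform_cdf) > delta
    return rej0 / trials, rej_a / trials, alpha

rate0, rate_a, alpha = rejection_rates()
print(f"rejection rate under F0: {rate0:.4f}")
print(f"rejection rate under Fa: {rate_a:.4f} (alpha = {alpha})")
```

Both samples are built from the same uniform draws, so the two rates are directly comparable. A rejection rate under $F_a$ that stays below $\alpha$ is the bias described in the proof: the test is less likely to reject when this alternative is true than the significance level allows under the null.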
Bias of the KS two-sample test for different sample sizes. Take two samples $X_1, \dots, X_m$ from $F$ and $Y_1, \dots, Y_n$ from $G$. The null hypothesis is then $H_0 : F = G$. We can then easily choose or assume $F$ and $G$ to satisfy the conditions above. The article proves that for equal sample sizes, $n = m$, the KS two-sample test is unbiased. However, for sufficiently large $m$ (with $n$ fixed), the KS test is biased for the alternative $F = F_a \ne G$, since in the limit $m \to \infty$ we obtain the Kolmogorov g.o.f. test.
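For reference, the two-sample statistic is the sup distance between the two empirical d.f.s. The sketch below is a plain implementation with a hypothetical function name, not code from the presentation.

```python
def two_sample_ks(xs, ys):
    """sup_t |F_m(t) - G_n(t)| between the empirical d.f.s of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < m and j < n:
        t = min(xs[i], ys[j])
        # Step past every observation equal to t in both samples (handles ties).
        while i < m and xs[i] == t:
            i += 1
        while j < n and ys[j] == t:
            j += 1
        d = max(d, abs(i / m - j / n))
    # Once one sample is exhausted the gap can only shrink, so d is final.
    return d

print(two_sample_ks([1, 3], [2, 4]))
```

As $m$ grows, the empirical d.f. of the first sample approaches $F$, so this statistic approaches the one-sample distance between $F$ and the empirical d.f. of the second sample. That is the limit the slide refers to.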
Thank you!
[Example slides 10/11 and 11/11: figures not available in the transcription]
Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course
More informationHyperspacings and the Estimation of Information Theoretic Quantities
Hyperspacings and the Estimation of Information Theoretic Quantities Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA 0003 elm@cs.umass.edu Abstract
More informationChapter 26 Multiple Regression, Logistic Regression, and Indicator Variables
Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables 26.1 S 4 /IEE Application Examples: Multiple Regression An S 4 /IEE project was created to improve the 30,000-footlevel metric
More informationThe t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary
Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis
More informationEcon 3790: Business and Economic Statistics. Instructor: Yogesh Uppal
Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 13, Part A: Analysis of Variance and Experimental Design Introduction to Analysis of Variance Analysis
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in
More informationEconometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague
Econometrics Week 11 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 30 Recommended Reading For the today Advanced Time Series Topics Selected topics
More informationWednesday, October 17 Handout: Hypothesis Testing and the Wald Test
Amherst College Department of Economics Economics 360 Fall 2012 Wednesday, October 17 Handout: Hypothesis Testing and the Wald Test Preview No Money Illusion Theory: Calculating True] o Clever Algebraic
More informationForefront of the Two Sample Problem
Forefront of the Two Sample Problem From classical to state-of-the-art methods Yuchi Matsuoka What is the Two Sample Problem? X ",, X % ~ P i. i. d Y ",, Y - ~ Q i. i. d Two Sample Problem P = Q? Example:
More informationSTATISTICS ASSIGNMENT 2
STATISTICS ASSIGNMENT 2 Matteo Sostero 815831 June 10, 2010 Introduction The following document is a brief statistical report as part of the second assignment. It covers the issues raised on a dataset
More informationUnit 14: Nonparametric Statistical Methods
Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based
More informationPaper Reference R. Statistics S4 Advanced/Advanced Subsidiary. Friday 21 June 2013 Morning Time: 1 hour 30 minutes
Centre No. Candidate No. Paper Reference(s) 6686/01R Edexcel GCE Statistics S4 Advanced/Advanced Subsidiary Friday 21 June 2013 Morning Time: 1 hour 30 minutes Materials required for examination Mathematical
More informationSTAT 830 Hypothesis Testing
STAT 830 Hypothesis Testing Hypothesis testing is a statistical problem where you must choose, on the basis of data X, between two alternatives. We formalize this as the problem of choosing between two
More informationThe University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80
The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple
More informationChapter 11 - Lecture 1 Single Factor ANOVA
April 5, 2013 Chapter 9 : hypothesis testing for one population mean. Chapter 10: hypothesis testing for two population means. What comes next? Chapter 9 : hypothesis testing for one population mean. Chapter
More informationCoefficient of Determination
Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance
More information