Data Processing Techniques

1 Universitas Gadjah Mada, Department of Civil and Environmental Engineering, Master in Engineering in Natural Disaster Management. Data Processing Techniques: Hypothesis Testing

2 Hypothesis Testing
Mathematical model vs. measurement: a theoretical line (computed by the model) is compared with measured values.
- If the computed values match the measured ones, the model is accepted.
- If the computed values do not fit the measured ones, the model is rejected.
In many cases, comparing the computed and measured values does not give a clear indication of whether to accept or to reject the model. Hypothesis testing provides an analysis tool for this comparison.

3 Hypothesis Testing
Steps in making statistical tests:
1. Formulate the hypothesis to be tested.
2. Formulate an alternative hypothesis.
3. Define a test statistic.
4. Define the distribution of the test statistic.
5. Define the rejection region (critical region) of the test statistic.
6. Collect the data needed to calculate the test statistic.
7. Determine whether the calculated value of the test statistic falls in the rejection region of the distribution of the test statistic.

4 Errors in Hypothesis Testing
Decision            Hypothesis is true   Hypothesis is false
Accept hypothesis   correct decision     Type II error
Reject hypothesis   Type I error         correct decision

5 Notation
$H_0$ = null hypothesis (hypothesis being tested)
$H_1$ = alternative hypothesis
$1 - \alpha$ = confidence level
$\alpha$ = level of significance

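6 Hypothesis Testing on Mean
Normal distribution, $\sigma_X$ is known.
$H_0: \mu = \mu_1$, $H_1: \mu = \mu_2$
Test statistic:
$$Z = \frac{\sqrt{n}\,(\bar{X} - \mu_1)}{\sigma_X}$$
has a normal distribution.
If $\mu_1 > \mu_2$, $H_0$ is rejected if:
$$\bar{X} \le \mu_1 - z_{1-\alpha}\,\frac{\sigma_X}{\sqrt{n}} \quad\Longleftrightarrow\quad Z \le -z_{1-\alpha}$$
If $\mu_2 > \mu_1$, $H_0$ is rejected if:
$$\bar{X} \ge \mu_1 + z_{1-\alpha}\,\frac{\sigma_X}{\sqrt{n}} \quad\Longleftrightarrow\quad Z \ge z_{1-\alpha}$$
[Figures: rejection region of area $\alpha$ in the lower tail (left of $-z_{1-\alpha}$) and in the upper tail (right of $z_{1-\alpha}$); the remaining area is $1 - \alpha$.]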
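The one-sided test with known $\sigma_X$ can be sketched in a few lines of Python. This is only an illustration: the function name `one_sided_z_test`, its parameters, and the numbers in the example are assumptions, not part of the slides. The critical value $z_{1-\alpha}$ is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, mu1, sigma, n, alpha=0.05, alternative="greater"):
    """Return (Z, z_crit, reject) for H0: mu = mu1 with sigma known.

    alternative="greater" corresponds to mu2 > mu1 (reject if Z >= z_crit);
    alternative="less" corresponds to mu1 > mu2 (reject if Z <= -z_crit).
    """
    z = sqrt(n) * (sample_mean - mu1) / sigma
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # z_{1-alpha}
    if alternative == "greater":
        reject = z >= z_crit
    elif alternative == "less":
        reject = z <= -z_crit
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return z, z_crit, reject


# Hypothetical numbers, for illustration only (not data from the slides).
z, z_crit, reject = one_sided_z_test(sample_mean=700.0, mu1=650.0, sigma=200.0, n=25)
print(f"Z = {z:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject}")
```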

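7 Hypothesis Testing on Mean
Normal distribution, $\sigma_X$ is unknown.
$H_0: \mu = \mu_1$, $H_1: \mu = \mu_2$
Test statistic:
$$T = \frac{\sqrt{n}\,(\bar{X} - \mu_1)}{s_X}$$
has a t distribution with $n - 1$ degrees of freedom.
If $\mu_1 > \mu_2$, $H_0$ is rejected if $T \le -t_{1-\alpha,\,n-1}$.
If $\mu_2 > \mu_1$, $H_0$ is rejected if $T \ge t_{1-\alpha,\,n-1}$.
[Figures: rejection region of area $\alpha$ in the lower tail (left of $-t_{1-\alpha,\,n-1}$) and in the upper tail (right of $t_{1-\alpha,\,n-1}$).]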

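8 Hypothesis Testing on Mean
Normal distribution, $\sigma_X$ is known.
$H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test statistic:
$$Z = \frac{\sqrt{n}\,(\bar{X} - \mu_0)}{\sigma_X}$$
has a normal distribution.
$H_0$ is rejected if:
$$|Z| = \left|\frac{\sqrt{n}\,(\bar{X} - \mu_0)}{\sigma_X}\right| > z_{1-\alpha/2}$$
[Figure: two rejection regions of area $\alpha/2$ each, left of $-z_{1-\alpha/2}$ and right of $z_{1-\alpha/2}$; the central area is $1 - \alpha$.]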

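9 Hypothesis Testing on Mean
Normal distribution, $\sigma_X$ is unknown.
$H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test statistic:
$$T = \frac{\sqrt{n}\,(\bar{X} - \mu_0)}{s_X}$$
has a t distribution with $n - 1$ degrees of freedom.
$H_0$ is rejected if:
$$|T| = \left|\frac{\sqrt{n}\,(\bar{X} - \mu_0)}{s_X}\right| > t_{1-\alpha/2,\,n-1}$$
[Figure: two rejection regions of area $\alpha/2$ each, left of $-t_{1-\alpha/2,\,n-1}$ and right of $t_{1-\alpha/2,\,n-1}$; the central area is $1 - \alpha$.]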
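A minimal Python sketch of the two-sided t test follows. The function names and the sample values are hypothetical. The Python standard library has no t quantile function, so the critical value $t_{1-\alpha/2,\,n-1}$ is assumed to be read from a t table and passed in by the caller.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """T = sqrt(n) * (sample mean - mu0) / s, with s the sample standard deviation."""
    n = len(sample)
    return sqrt(n) * (mean(sample) - mu0) / stdev(sample)


def two_sided_t_test(sample, mu0, t_crit):
    """Reject H0: mu = mu0 when |T| > t_crit, where t_crit = t_{1-alpha/2, n-1}."""
    t_value = t_statistic(sample, mu0)
    return t_value, abs(t_value) > t_crit


# Hypothetical sample of n = 10 values, for illustration only.
sample = [612.0, 655.0, 701.0, 590.0, 640.0, 688.0, 725.0, 630.0, 670.0, 659.0]
T_CRIT = 2.262  # t_{0.975, 9} from a standard t table (alpha = 0.05)
t, reject = two_sided_t_test(sample, mu0=650.0, t_crit=T_CRIT)
print(f"T = {t:.3f}, reject H0: {reject}")
```

The one-sided variants on the earlier slide use the same statistic with $t_{1-\alpha,\,n-1}$ and a comparison on one side only.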

10 Hypothesis Testing on Mean
Result of a hypothesis test: "accept $H_0$" means "fail to reject $H_0$".
Meaning, for $H_0: \mu = \mu_0$:
- Accepting $H_0$ means that we fail to reject $H_0$.
- Based on the sample that we have, we say that the population mean is not significantly different from $\mu_0$.
- We cannot say that the population mean really equals $\mu_0$, since we have not proved that $\mu = \mu_0$.

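11 Test for Differences in Means of Two Normal Distributions
Normal distribution, $\mathrm{var}(X_1)$ and $\mathrm{var}(X_2)$ are known.
$H_0: \mu_1 - \mu_2 = \delta$, $H_1: \mu_1 - \mu_2 \ne \delta$
Test statistic:
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - \delta}{\left(\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)^{1/2}}$$
has a standard normal distribution.
$H_0$ is rejected if $|Z| > z_{1-\alpha/2}$.
[Figure: two rejection regions of area $\alpha/2$ each, left of $-z_{1-\alpha/2}$ and right of $z_{1-\alpha/2}$.]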
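As an illustration, the two-sample test with known variances might be coded as below. The function name, argument names, and numbers are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, var1, var2, n1, n2, delta=0.0, alpha=0.05):
    """Return (Z, z_crit, reject) for H0: mu1 - mu2 = delta with known variances."""
    z = (mean1 - mean2 - delta) / sqrt(var1 / n1 + var2 / n2)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}
    return z, z_crit, abs(z) > z_crit


# Hypothetical summary values, for illustration only.
z, z_crit, reject = two_sample_z_test(
    mean1=680.0, mean2=620.0, var1=40000.0, var2=50000.0, n1=40, n2=50
)
print(f"Z = {z:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject}")
```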

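12 Test for Differences in Means of Two Normal Distributions
Normal distribution, $\mathrm{var}(X_1)$ and $\mathrm{var}(X_2)$ are unknown.
$H_0: \mu_1 - \mu_2 = \delta$, $H_1: \mu_1 - \mu_2 \ne \delta$
Test statistic:
$$T = \frac{\bar{X}_1 - \bar{X}_2 - \delta}{\left\{\dfrac{n_1 + n_2}{n_1 n_2}\cdot\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right\}^{1/2}}$$
has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.
$H_0$ is rejected if $|T| > t_{1-\alpha/2,\,n_1+n_2-2}$.
[Figure: two rejection regions of area $\alpha/2$ each, left of $-t_{1-\alpha/2,\,n_1+n_2-2}$ and right of $t_{1-\alpha/2,\,n_1+n_2-2}$.]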
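The pooled-variance statistic above can be sketched as follows. The function name and the two small samples are made up for illustration; the critical value is assumed to come from a t table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(x1, x2, delta=0.0):
    """Return (T, degrees of freedom) for H0: mu1 - mu2 = delta, variances unknown."""
    n1, n2 = len(x1), len(x2)
    # Pooled estimate of the common variance from both samples.
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t_value = (mean(x1) - mean(x2) - delta) / sqrt(pooled_var * (n1 + n2) / (n1 * n2))
    return t_value, n1 + n2 - 2


# Hypothetical samples, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 3.0, 4.0, 5.0, 6.0]
T_CRIT = 2.306  # t_{0.975, 8} from a standard t table (alpha = 0.05)
t, dof = pooled_t_statistic(x1, x2)
print(f"T = {t:.3f} with {dof} degrees of freedom, reject H0: {abs(t) > T_CRIT}")
```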

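13 Test on Variance
Normal distribution.
$H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
Test statistic:
$$\chi_c^2 = \sum_{i=1}^{n}\frac{(X_i - \bar{X})^2}{\sigma_0^2}$$
has a chi-square distribution.
$H_0$ is accepted if:
$$\chi^2_{\alpha/2,\,n-1} < \chi_c^2 < \chi^2_{1-\alpha/2,\,n-1}$$
[Figure: acceptance region of area $1 - \alpha$ between $\chi^2_{\alpha/2,\,n-1}$ and $\chi^2_{1-\alpha/2,\,n-1}$, with area $\alpha/2$ in each tail.]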
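A short Python sketch of the variance test follows. The names and the sample are hypothetical, and the two chi-square limits are assumed to be read from a chi-square table for $n - 1$ degrees of freedom.

```python
from statistics import mean


def chi_square_variance_statistic(sample, sigma0_sq):
    """chi_c^2 = sum((X_i - mean)^2) / sigma0^2."""
    xbar = mean(sample)
    return sum((x - xbar) ** 2 for x in sample) / sigma0_sq


def variance_test(sample, sigma0_sq, chi2_lower, chi2_upper):
    """Accept H0: sigma^2 = sigma0^2 when chi2_lower < chi_c^2 < chi2_upper."""
    chi2_c = chi_square_variance_statistic(sample, sigma0_sq)
    return chi2_c, chi2_lower < chi2_c < chi2_upper


# Hypothetical sample of n = 10 values, for illustration only.
sample = [612.0, 655.0, 701.0, 590.0, 640.0, 688.0, 725.0, 630.0, 670.0, 659.0]
# Table values for 9 degrees of freedom and alpha = 0.05.
CHI2_LOWER = 2.700   # chi^2_{0.025, 9}
CHI2_UPPER = 19.023  # chi^2_{0.975, 9}
chi2_c, accept = variance_test(sample, sigma0_sq=2000.0,
                               chi2_lower=CHI2_LOWER, chi2_upper=CHI2_UPPER)
print(f"chi_c^2 = {chi2_c:.3f}, accept H0: {accept}")
```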

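14 Test on Variance of Two Normal Distributions
Normal distribution.
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$
Test statistic:
$$F_c = \frac{s_1^2}{s_2^2}$$
has an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.
$H_0$ is rejected if:
$$F_c > F_{1-\alpha,\,n_1-1,\,n_2-1}$$
[Figure: rejection region of area $\alpha$ in the upper tail, right of $F_{1-\alpha,\,n_1-1,\,n_2-1}$.]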
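The variance-ratio statistic is simple to compute. The sketch below uses made-up samples and an F critical value taken from a table; the function name is an assumption.

```python
from statistics import variance


def f_statistic(x1, x2):
    """Return (F_c, numerator dof, denominator dof) with F_c = s1^2 / s2^2."""
    return variance(x1) / variance(x2), len(x1) - 1, len(x2) - 1


# Hypothetical samples, for illustration only.
x1 = [1.0, 3.0, 5.0, 7.0, 9.0]
x2 = [2.0, 3.0, 4.0, 5.0, 6.0]
F_CRIT = 6.39  # F_{0.95, 4, 4} from a standard F table
f_c, dof1, dof2 = f_statistic(x1, x2)
print(f"F_c = {f_c:.2f} with ({dof1}, {dof2}) degrees of freedom, "
      f"reject H0: {f_c > F_CRIT}")
```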

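15 Test on Variance of Several Normal Distributions
Normal distribution.
$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$, $H_1: \sigma_1^2 \ne \sigma_2^2 \ne \dots \ne \sigma_k^2$
Test statistic: $Q/h$ has a chi-square distribution with $(k - 1)$ degrees of freedom, where
$$Q = (N - k)\,\ln\!\left[\frac{\sum_{i=1}^{k}(n_i - 1)\,s_i^2}{N - k}\right] - \sum_{i=1}^{k}(n_i - 1)\,\ln s_i^2$$
$$h = 1 + \frac{1}{3(k - 1)}\left[\sum_{i=1}^{k}\frac{1}{n_i - 1} - \frac{1}{N - k}\right], \qquad N = \sum_{i=1}^{k} n_i$$
$H_0$ is rejected if:
$$\frac{Q}{h} > \chi^2_{1-\alpha,\,k-1}$$
[Figure: rejection region of area $\alpha$ in the upper tail, right of $\chi^2_{1-\alpha,\,k-1}$.]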
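The statistic $Q/h$ can be computed directly from the group samples. The following is a sketch under the definitions above; the function name and the three small groups are hypothetical, and the critical value is taken from a chi-square table.

```python
from math import log
from statistics import variance


def q_over_h(groups):
    """Return (Q/h, degrees of freedom) for H0: all k group variances are equal."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]
    big_n = sum(sizes)  # N = total number of observations
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (big_n - k)
    q = (big_n - k) * log(pooled) - sum(
        (n - 1) * log(s2) for n, s2 in zip(sizes, variances)
    )
    h = 1.0 + (sum(1.0 / (n - 1) for n in sizes) - 1.0 / (big_n - k)) / (3.0 * (k - 1))
    return q / h, k - 1


# Hypothetical groups, for illustration only.
groups = [
    [1.0, 3.0, 5.0, 7.0, 9.0],
    [2.0, 3.0, 4.0, 5.0, 6.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
]
CHI2_CRIT = 5.991  # chi^2_{0.95, 2} from a standard chi-square table
stat, dof = q_over_h(groups)
print(f"Q/h = {stat:.3f} with {dof} degrees of freedom, reject H0: {stat > CHI2_CRIT}")
```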

16 Hypothesis Testing Exercises
Refer to the annual peak discharge of XYZ River. Test that the peak discharge of XYZ River has a mean value of 650 m³/s and a variance of 45,000 m⁶/s².
Refer to the file entitled "Exercises on hypothesis testing.pdf" and do these exercises.

17 Testing the Goodness of Fit of Data to Probability Distributions
Graphical (and visual) methods to judge whether or not a particular distribution adequately describes a set of observations:
- plot and compare the observed relative frequency curve with the theoretical relative frequency curve;
- plot the observed data on appropriate probability paper and judge whether or not the resulting plot is a straight line.
Statistical tests:
- the chi-square goodness of fit test;
- the Kolmogorov-Smirnov test.

18 Annual Peak Discharge of XYZ River
[Figure: relative frequency vs. discharge (m³/s), comparing the theoretical distribution with the observed data.]

19 [Figure: probability plot. Markers: observed data. Line: theoretical distribution.]

20 Normal Distribution Paper
[Figure: normal probability paper.]

21 Chi-square Goodness of Fit Test
Method of test:
- Compare the actual number of observations with the expected number of observations (expected according to the distribution under test) that fall in the class intervals.
- The expected numbers are calculated by multiplying the expected relative frequency by the total number of observations.
- The test statistic is calculated from the following relationship:
$$\chi_c^2 = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}$$

22 Chi-square Goodness of Fit Test
In the relationship for $\chi_c^2$:
- $k$ is the number of class intervals;
- $O_i$ is the number of observations in the $i$th class interval;
- $E_i$ is the expected number of observations in the $i$th class interval according to the distribution being tested.
$\chi_c^2$ has a chi-square distribution with $(k - p - 1)$ degrees of freedom, where $p$ is the number of parameters estimated from the data.

23 Chi-square Goodness of Fit Test
The hypothesis that the data are from the specified distribution is rejected if:
$$\chi_c^2 > \chi^2_{1-\alpha,\,k-p-1}$$
[Figure: rejection region of area $\alpha$ in the upper tail, right of $\chi^2_{1-\alpha,\,k-p-1}$.]
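The chi-square goodness of fit statistic reduces to one sum over the class intervals. The sketch below uses hypothetical class counts. It assumes a normal distribution fitted with $p = 2$ estimated parameters and $k = 4$ classes, so the critical value is the table value for $k - p - 1 = 1$ degree of freedom.

```python
def chi_square_gof(observed, expected):
    """chi_c^2 = sum((O_i - E_i)^2 / E_i) over the k class intervals."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for n = 40 observations in k = 4 classes.
observed = [8, 12, 11, 9]
expected = [10.0, 10.0, 10.0, 10.0]  # expected relative frequency 0.25 times n = 40
CHI2_CRIT = 3.841  # chi^2_{0.95, 1} from a standard chi-square table
chi2_c = chi_square_gof(observed, expected)
print(f"chi_c^2 = {chi2_c:.3f}, reject H0: {chi2_c > CHI2_CRIT}")
```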

24 The Kolmogorov-Smirnov Test
Steps in the Kolmogorov-Smirnov test:
1. Let $P_X(x)$ be the completely specified theoretical cumulative distribution function under the null hypothesis.
2. Let $S_n(x)$ be the sample cumulative distribution function based on $n$ observations. For any observed $x$, $S_n(x) = k/n$, where $k$ is the number of observations less than or equal to $x$.
3. Determine the maximum deviation, $D$, defined by:
$$D = \max\,\lvert P_X(x) - S_n(x)\rvert$$
4. If, for the chosen significance level, the observed value of $D$ is greater than or equal to the tabulated critical value of the Kolmogorov-Smirnov statistic, the hypothesis is rejected.
Tables of the Kolmogorov-Smirnov test statistic are available in many books on statistics.
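The steps above can be sketched in Python as follows. The function name and the sample are hypothetical. The mean of 650 m³/s and variance of 45,000 m⁶/s² are borrowed from the exercise slide only to have a completely specified normal distribution. One addition beyond the slide's definition: because $S_n(x)$ is a step function, the sketch also compares $P_X(x)$ with $(k-1)/n$ just below each observed point, which is a common refinement.

```python
from math import sqrt
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Maximum deviation D between the theoretical CDF and the sample CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs, start=1):
        p = cdf(x)
        # S_n jumps from (k - 1)/n to k/n at x, so check both sides of the step.
        d = max(d, abs(p - k / n), abs(p - (k - 1) / n))
    return d


# Hypothetical sample, for illustration only.
sample = [612.0, 655.0, 701.0, 590.0, 640.0, 688.0, 725.0, 630.0, 670.0, 659.0]
theoretical = NormalDist(mu=650.0, sigma=sqrt(45000.0))
d = ks_statistic(sample, theoretical.cdf)
print(f"D = {d:.3f}")  # compare with the tabulated critical value for n = 10
```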

25 The Kolmogorov-Smirnov Test
Notes on the Kolmogorov-Smirnov test:
- The test can be conducted by calculating the quantities $P_X(x)$ and $S_n(x)$ at each observed point, or by plotting the data on probability paper and selecting the greatest deviation, on the probability scale, of a point from the theoretical line.
- The data should not be grouped for this test, i.e. plot each point of the data on the probability paper.

26 Chi-square Goodness of Fit Test and the Kolmogorov-Smirnov Test
Exercise: apply the chi-square goodness of fit test and the Kolmogorov-Smirnov test to the annual peak discharge of XYZ River against the normal distribution.

27 Chi-square Goodness of Fit Test and the Kolmogorov-Smirnov Test
Notes on both tests when testing hydrologic frequency distributions:
- Both tests are insensitive in the tails of the distributions.
- On the other hand, the tails are important in hydrologic frequency distributions.
To increase the sensitivity of the chi-square test:
- The expected number of observations in a class shall not be less than 3 (or 5).
- Define the class intervals so that, under the hypothesis being tested, the expected number of observations in each class interval is the same. The class intervals will then be of unequal width, and the interval widths will be a function of the distribution being tested.
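Class intervals with equal expected counts can be obtained from the quantiles of the distribution being tested. The sketch below assumes a normal distribution and uses `statistics.NormalDist.inv_cdf`. The function names are hypothetical, and the mean and variance are borrowed from the exercise slide purely as an example.

```python
from math import sqrt
from statistics import NormalDist


def equiprobable_class_limits(dist, k):
    """Interior limits that split dist into k classes of probability 1/k each."""
    return [dist.inv_cdf(i / k) for i in range(1, k)]


def count_observations(sample, limits):
    """Number of observations in each of the len(limits) + 1 classes."""
    counts = [0] * (len(limits) + 1)
    for x in sample:
        # Class index = number of interior limits lying below x.
        counts[sum(1 for limit in limits if x > limit)] += 1
    return counts


# Example: k = 5 classes for a normal distribution (parameters assumed).
dist = NormalDist(mu=650.0, sigma=sqrt(45000.0))
limits = equiprobable_class_limits(dist, k=5)
print([round(limit, 1) for limit in limits])
# With n observations, the expected count in every class is n / k.
```

The resulting classes are narrow near the mean and wide in the tails, which is the unequal-width behaviour noted above.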

28 Chi-square Goodness of Fit Test and the Kolmogorov-Smirnov Test
Exercise: redo the chi-square goodness of fit test and the Kolmogorov-Smirnov test on the annual peak discharge of XYZ River against the normal distribution. Define the class intervals so that the expected number of observations in each class interval is the same.


Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts Statistical methods for comparing multiple groups Lecture 7: ANOVA Sandy Eckel seckel@jhsph.edu 30 April 2008 Continuous data: comparing multiple means Analysis of variance Binary data: comparing multiple

More information

The Chi-Square Distributions

The Chi-Square Distributions MATH 03 The Chi-Square Distributions Dr. Neal, Spring 009 The chi-square distributions can be used in statistics to analyze the standard deviation of a normally distributed measurement and to test the

More information

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between 7.2 One-Sample Correlation ( = a) Introduction Correlation analysis measures the strength and direction of association between variables. In this chapter we will test whether the population correlation

More information

Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats

Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats Materials Needed: Bags of popcorn, watch with second hand or microwave with digital timer. Instructions: Follow the instructions on the

More information

CHAPTER 7. Hypothesis Testing

CHAPTER 7. Hypothesis Testing CHAPTER 7 Hypothesis Testing A hypothesis is a statement about one or more populations, and usually deal with population parameters, such as means or standard deviations. A research hypothesis is a conjecture

More information

IEOR165 Discussion Week 12

IEOR165 Discussion Week 12 IEOR165 Discussion Week 12 Sheng Liu University of California, Berkeley Apr 15, 2016 Outline 1 Type I errors & Type II errors 2 Multiple Testing 3 ANOVA IEOR165 Discussion Sheng Liu 2 Type I errors & Type

More information

EE/CpE 345. Modeling and Simulation. Fall Class 10 November 18, 2002

EE/CpE 345. Modeling and Simulation. Fall Class 10 November 18, 2002 EE/CpE 345 Modeling and Simulation Class 0 November 8, 2002 Input Modeling Inputs(t) Actual System Outputs(t) Parameters? Simulated System Outputs(t) The input data is the driving force for the simulation

More information

A 21 spontaneous radia:ve decay (s - 1 ) B 12 induced excita:on (via photons) [B 12 U ν s - 1 ]

A 21 spontaneous radia:ve decay (s - 1 ) B 12 induced excita:on (via photons) [B 12 U ν s - 1 ] Collisionally- excited emission Lines Excita:on and emission in a - level atom Emission Lines Emission lines result from photon- emiang downward transi:ons, and provide key informa:on on the emiang objects,

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

More information

Collisionally- excited emission Lines

Collisionally- excited emission Lines Collisionally- excited emission Lines Excita9on and emission in a 2- level atom Emission Lines Emission lines result from photon- emi@ng downward transi9ons, and provide key informa9on on the emi@ng objects,

More information

Topic 21 Goodness of Fit

Topic 21 Goodness of Fit Topic 21 Goodness of Fit Contingency Tables 1 / 11 Introduction Two-way Table Smoking Habits The Hypothesis The Test Statistic Degrees of Freedom Outline 2 / 11 Introduction Contingency tables, also known

More information

Bias/variance tradeoff, Model assessment and selec+on

Bias/variance tradeoff, Model assessment and selec+on Applied induc+ve learning Bias/variance tradeoff, Model assessment and selec+on Pierre Geurts Department of Electrical Engineering and Computer Science University of Liège October 29, 2012 1 Supervised

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT):

Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT): Lecture Three Normal theory null distributions Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT): A random variable which is a sum of many

More information

Part III: Unstructured Data

Part III: Unstructured Data Inf1-DA 2010 2011 III: 51 / 89 Part III Unstructured Data Data Retrieval: III.1 Unstructured data and data retrieval Statistical Analysis of Data: III.2 Data scales and summary statistics III.3 Hypothesis

More information

Hypothesis testing. 1 Principle of hypothesis testing 2

Hypothesis testing. 1 Principle of hypothesis testing 2 Hypothesis testing Contents 1 Principle of hypothesis testing One sample tests 3.1 Tests on Mean of a Normal distribution..................... 3. Tests on Variance of a Normal distribution....................

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Lecture 9. ANOVA: Random-effects model, sample size

Lecture 9. ANOVA: Random-effects model, sample size Lecture 9. ANOVA: Random-effects model, sample size Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regressions and Analysis of Variance fall 2015 Fixed or random? Is it reasonable

More information

Topic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing. Road Map Sampling Distributions, Confidence Intervals & Hypothesis Testing

Topic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing. Road Map Sampling Distributions, Confidence Intervals & Hypothesis Testing Topic 3: Sampling Distributions, Confidence Intervals & Hypothesis Testing ECO22Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit

More information

Rank-Based Methods. Lukas Meier

Rank-Based Methods. Lukas Meier Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data

More information

Data Mining. Chapter 5. Credibility: Evaluating What s Been Learned

Data Mining. Chapter 5. Credibility: Evaluating What s Been Learned Data Mining Chapter 5. Credibility: Evaluating What s Been Learned 1 Evaluating how different methods work Evaluation Large training set: no problem Quality data is scarce. Oil slicks: a skilled & labor-intensive

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t = 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result

More information

Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies

Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies Confounding in gene+c associa+on studies q What is it? q What is the effect? q How to detect it?

More information

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2 Problem.) I will break this into two parts: () Proving w (m) = p( x (m) X i = x i, X j = x j, p ij = p i p j ). In other words, the probability of a specific table in T x given the row and column counts

More information

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments /4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population

More information

An Introduc+on to Sta+s+cs and Machine Learning for Quan+ta+ve Biology. Anirvan Sengupta Dept. of Physics and Astronomy Rutgers University

An Introduc+on to Sta+s+cs and Machine Learning for Quan+ta+ve Biology. Anirvan Sengupta Dept. of Physics and Astronomy Rutgers University An Introduc+on to Sta+s+cs and Machine Learning for Quan+ta+ve Biology Anirvan Sengupta Dept. of Physics and Astronomy Rutgers University Why Do We Care? Necessity in today s labs Principled approach:

More information

STA2601. Tutorial letter 203/2/2017. Applied Statistics II. Semester 2. Department of Statistics STA2601/203/2/2017. Solutions to Assignment 03

STA2601. Tutorial letter 203/2/2017. Applied Statistics II. Semester 2. Department of Statistics STA2601/203/2/2017. Solutions to Assignment 03 STA60/03//07 Tutorial letter 03//07 Applied Statistics II STA60 Semester Department of Statistics Solutions to Assignment 03 Define tomorrow. university of south africa QUESTION (a) (i) The normal quantile

More information

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y

More information

EE/CpE 345. Modeling and Simulation. Fall Class 9

EE/CpE 345. Modeling and Simulation. Fall Class 9 EE/CpE 345 Modeling and Simulation Class 9 208 Input Modeling Inputs(t) Actual System Outputs(t) Parameters? Simulated System Outputs(t) The input data is the driving force for the simulation - the behavior

More information

40.2. Interval Estimation for the Variance. Introduction. Prerequisites. Learning Outcomes

40.2. Interval Estimation for the Variance. Introduction. Prerequisites. Learning Outcomes Interval Estimation for the Variance 40.2 Introduction In Section 40.1 we have seen that the sampling distribution of the sample mean, when the data come from a normal distribution (and even, in large

More information

POLI 443 Applied Political Research

POLI 443 Applied Political Research POLI 443 Applied Political Research Session 6: Tests of Hypotheses Contingency Analysis Lecturer: Prof. A. Essuman-Johnson, Dept. of Political Science Contact Information: aessuman-johnson@ug.edu.gh College

More information