Chapter 11. Hypothesis Testing (II)
|
|
- Dorothy Collins
- 5 years ago
- Views:
Transcription
1 Chapter 11. Hypothesis Testing (II) 11.1 Likelihood Ratio Tests one of the most popular ways of constructing tests when both null and alternative hypotheses are composite (i.e. not a single point). Let X f (, θ). Consider hypotheses H 0 : θ Θ 0 vs H 1 : θ Θ Θ 0. The likelihood ratio test will reject H 0 for the large values of the statistic LR = LR(X) sup θ Θ f (X, θ) sup θ Θ0 f (X, θ) = f (X, θ) / f (X, θ), where θ the (unconstrained) MLE, and θ is the constrained MLE under hypothesis H 0.
2 Remark. (i) It is easy to see that LR 1. (ii) The exact sampling distributions of LR are usually unknown, except in a few special cases. Example 1. (One-sample t -test) Let X = (X 1,, X n ) τ be a random sample from N (µ, σ 2 ). We are interested in testing hypotheses H 0 : µ = µ 0 against H 1 : µ µ 0, where µ 0 is given, and σ 2 is unknown and is a nuisance parameter. Now both H 0 and H 1 are composite. The likelihood function is L(µ, σ 2 ) = C σ n exp { 1 2σ 2 n (X i µ) 2}.
3 The unconstrained MLEs are and the constrained MLE is The LR-ratio statistic is then Since µ = X, σ 2 = 1 n σ 2 = 1 n n (X i X ) 2, n (X i µ 0 ) 2. LR = L( µ, σ2 ) L(µ 0, σ 2 ) = ( σ2 / σ 2 ) n/2. n σ 2 = n σ 2 + n( X µ 0 ) 2,
4 it holds that σ 2 / σ 2 = 1 + T 2 /(n 1), where T = n( X µ 0 ) /{ 1 n 1 n (X i X ) 2} 1/2. Note that T t n 1 under H 0. The LRT will reject H 0 iff T > t n 1,α/2, where t k,α is the upper α-point of the t -distribution with k degrees of freedom.
5 Asymptotic Distribution of Likelihood ratio test statistic Let X = (X 1,, X n ) τ, and assume certain regularity conditions. Then as n, the distribution of 2 log(lr) under H 0 converges to the χ 2 - distribution with d d 0 degrees of freedom, where d is the dimension of Θ and d 0 is the dimension of Θ 0. To make the computation of dimension easy, reparametrisation is often adopted. Suppose that the parameter θ may be written in two parts θ = (ψ, λ) where ψ is k 1 parameter of interest, and λ is of little interest and is called nuisance parameters. The hypotheses to be tested may be expressed as H 0 : ψ = ψ 0 vs H 1 : ψ ψ 0.
6 Now the LR-statistic is of the form LR = L( ψ, λ; X) L(ψ 0, λ; X), where ( ψ, λ) is unconstrained MLE while λ is the constrained MLE of λ subject to ψ = ψ 0. Then as n, 2 log(lr) D χ 2 k under H 0. Example 2. Let X 1,, X n be independent, and X j N (µ j, 1). Consider the null hypothesis H 0 : µ 1 = = µ n.
7 The likelihood function is L(µ 1,, µ n ) = C exp { 1 2 n (X j µ j ) 2}, where C > 0 is a constant independent of µ j. Then the unconstrained MLE are µ j = X j and the constrained MLE is µ = X. Hence Hence LR = L( µ 1,, µ n ) L( µ,, µ) 2 log(lr) = = exp { 1 2 n (X j X ) 2}. n (X j X ) 2 χn 1 2 under H 0, which is true for any finite n as well. How to calculate the degree of freedom?
8 Since d = n, d 0 = 1, the d.f. is d d 0 = n 1. Alternatively we may adopt the following reparametrisation: µ j = µ 1 + ψ j for 2 j n. Then the null hypothesis can be expressed as H 0 : ψ 2 = = ψ n = 0. Therefore ψ = (ψ 2,, ψ n ) τ has n 1 component, i.e. k = n The permutation test a nonparametric method for testing if two distributions are the same. It is particularly appealing when sample sizes are small, as it does not rely on any asymptotic theory.
9 Let X 1,, X m be sample from distribution F x and Y 1,,Y n be a sample from distribution F y. We are interested in testing H 0 : F x = F y versus H 1 : F x F y. Key idea: under H 0, {X 1,, X m,y 1,,Y n } form a sample of size m + n from a single distribution. Choose a test statistic T = T (X 1,, X m,y 1,,Y n ) which is capable to tell the difference between the two distribution, e.g. T = X Y, or T = X Y 2 + Sx 2 Sy 2. Consider all (m + n)! permutations of (X 1,, X m,y 1,,Y n ), compute the test statistic T for each permutation, yielding the values T 1,,T (m+n)!.
10 The p-value of the test is defined as p = 1 (m + n)! (m+n)! I (T j > t obs ), where t obs = T (X 1,, X m,y 1,,Y n ). We reject H 0 at the significance level α if p α. Note. When H 0 holds, all those (m + n)! T j s are on the equal footing, and t obs = T (X 1,, X m,y 1,,Y n ) is one of them. Therefore t obs is unlikely to be an extreme value among T j s.
11 Algorithm for Permutation Tests: 1. Compute t obs = T (X 1,, X m,y 1,,Y n ). 2. Randomly permute the data. Compute T again using the permuted date. 3. Repeat Step 2 B times, and let T 1,,T B denote the resulting values. 4. The approximate p-value is B 1 1 j B I (T j > t obs ). Remark. Let Z = (X 1,, X m,y 1,,Y n ) (Z <- c(x,y)). A permutation of Z may be obtained in R as Zp <- sample(z, n+m) You may also use the R-function sample.int: k <- sample.int(n+m, n+m) Now k is a permutation of {1, 2,, n + m}.
12 Example 3. Class A was taught using detailed PowerPoint slides. The marks in the final exam are 45, 55, 39, 60, 64, 85, 80, 64, 48, 62, 75, 77, 50. Students in Class B were required to read books and answer questions in class discussions. The marks in the final exam are 45, 59, 48, 74, 73, 78, 66, 69, 79, 81, 60, 52. Can we infer that the marks from the two classes are significantly different? We conduct the permutation test using the test statistic T = X Y in R: > x <- c(45, 55, 39, 60, 64, 85, 80, 64, 48, 62, 75, 77, 50) > y <- c(45, 59, 48, 74, 73, 78, 66, 69, 79, 81, 60, 52) > length(x); length(y) [1] 13
13 [1] 12 > summary(x) Min. 1st Qu. Median Mean 3rd Qu. Max > summary(y) Min. 1st Qu. Median Mean 3rd Qu. Max > Tobs <- abs(mean(x)-mean(y)) > z <- c(x,y) > k <- 0 > for(i in 1:5000) { + zp <- sample(z, 25) # zp is a permutation of z + T <- abs(mean(zp[1:13])-mean(zp[14:25])) + if(t>tobs) k <- k+1 + } cat("p-value:", k/5000, "\n") p-value: Since p-value is , we cannot reject the null-hypothesis that the mark distributions of the two classes are the same.
14 We also apply the t -sample, obtaining the similar results: > t.test(x, y, var.equal=t) # mu=0 is the default Two Sample t-test data: x and y t = , df = 23, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval:
15 11.3 χ 2 -tests Goodness-of-fit tests: to test if a given distribution fits the data. Let {X 1,, X n } be a random sample from a discrete distribution of k categories denoted by 1,, k. Denote the probability function Then p j 0 and k p j = 1. p j = P (X i = j ), j = 1,, k. Typically n >> k. Therefore the data are often compressed into a table: Category 1 2 k Frequency Z 1 Z 2 Z k
16 where Obviously k Z j = n. Z j = No. of X i s equal to j, j = 1,, k. To test the null hypothesis H 0 : p i = p i (θ), i = 1,, k, where the function forms of p i (θ) are known but the parameter θ is unknown. For example, p i (θ) = θ i 1 e θ /(i 1)! (i.e. Poisson distribution). We first estimate θ by, for example, its MLE θ. The expected frequencies under H 0 are E i = np i ( θ), i = 1,, k.
17 Listing them together with observed frequencies, we have Category 1 2 k Frequency Z 1 Z 2 Z k Expected frequency E 1 E 2 E k If H 0 is true, we expect Z j E j = np j ( θ) when n is large, as, by the LLN, it holds Z j n = 1 n I (X n i = j ) E {I (X i = j )} = P (X i = j ) = p j (θ). i =1 Test statistic: T = k (Z j E j ) 2/ E j. Theorem. Under H 0, T components in θ. D χ 2 k 1 d as n, where d is the number of
18 Remark. (i) It is important that E i 5 at least. This may be achieved by combining together the categories with smaller expected frequencies. (ii) When p j are completely specified (i.e. known) under H 0, d = 0. Example 4. A supermarket recorded the numbers of arrivals over 100 one-minute intervals. The data were summarized as follows No. of arrivals Frequency Do the data match a Poisson distribution? The null hypothesis is H 0 : p i = λ i e λ /i! for i = 0, 1,. We find the MLE for λ first.
19 The likelihood function: L(λ) = 100 i =1 λx i 100 X i! e λ λ i =1 X i e 100λ. The log-likelihood function: l (λ) = log(λ) 100 i =1 X i 100λ. Let d dλ l (λ) = 0, leading to λ = i =1 X i = X. Since we are only given the counts Z j instead of X i, we need to compute X from Z j. Recall Z j = no. of X i equal to j. Hence = X = 1 n n X i = 1 n i =1 k j Z j 1 ( ) = 1.81.
20 With λ = 1.81, the expected frequencies are E i = n p i ( λ) = 100 (1.81)i e 1.81, i! i = 0, 1,. We combine the last three categories to make sure E i 5. No. of arrivals Total Frequency Z i p i ( λ) = λ i e λ /i! Expected frequency E i Difference Z i E i (Z i E i ) 2 /E i Note under H 0, T = 4 i =0 (Z i E i ) 2 /E i χ = χ2. Since T = < 3 χ 0.10, 2 = 6.25, we cannot reject the assumption that the data follow a 3 Poisson distribution.
21 Remark. (i) The goodness-of-fit test has been widely used in practice. However we should bear in mind that when H 0 cannot be rejected, we are not in the position to conclude that the assumed distribution is true, as not reject accept" (ii) The above test may be used to test the goodness-of-fit of a continuous distribution via discretization. However there exist more appropriate methods such as Kolmogorov-Smirnov test and Cramér-von Mises test, which deal with the goodness-of-fit for continuous distributions directly.
22 Tests for contingency tables Tests for independence of two discrete random variables Let (X,Y ) be two discrete random variables, and X have r categories and Y have c categories. Let p ij = P (X = i, Y = j ), i = 1,, r, j = 1,, c. Then p ij 0 and i,j p ij r i =1 cj =1 p ij = 1. Let p i = P (X = i ) and p j = P (Y = j ). It is easy to see that p i = c P (X = i, Y = j ) = c p ij = j p ij Similarly, p j = i p ij
23 X and Y are independent iff p ij = p i p j for i = 1,, r and j = 1,, c. Suppose we have n pairs of observations from (X,Y ). The data are presented in a contingency table below Y 1 2 c 1 Z 11 Z 12 Z 1c X 2. Z 21. Z 22.. Z 2c. r Z r 1 Z r 2 Z r c where Z ij = no. of the pairs equal to (i, j ).
24 It is often useful to add the marginals into the table: c r r Z i = Z ij, Z j = Z ij, Z = Z i = i =1 i =1 c Z j = n Y 1 2 c 1 Z 11 Z 12 Z 1c Z 1 X 2. Z 21. Z 22.. Z 2c. Z 2. r Z r 1 Z r 2 Z r c Z r Z 1 Z 2 Z c Z = n
25 We are interested in testing the independence H 0 : p ij = p i p j, i = 1,, r, j = 1,, c. Under H 0, a natural estimator for p ij is p ij = p i p j = Z i n Hence the expected frequency at the (i, j )-th cell is E ij = n p ij = Z i Z j /n = Z i Z j /Z, i = 1,, r, j = 1,, c. If H 0 is true, we expect Z ij defined as T = Z j n E ij. The goodness-of-fit test statistic is r i =1 c (Z ij E ij ) 2/ E ij. We reject H 0 for large values of T.
26 Under H 0, T χ 2 p d, where p = no. of cells -1 = r c 1 d = no. of estimated free parameters = r + c 2. Note. 1. The sum of r c counts Z ij is n fixed. So knowing r c 1 of them, the other one is also known. This is why p = r c The estimated parameters are p i and p j. But r i =1 p i = 1 and c p j = 1. Hence d = (r 1) + (c 1) = r + c For testing independence, it always holds that Z i E i = 0 and Z j E j = 0. Those are useful facts in checking for computational errors. The proofs are simple, as, for example, Z i E i = Z i E ij = Z i j j Z i Z j Z = Z i Z i Z Z = 0.
27 Theorem. Under H 0, the limiting distribution of T is χ 2 with (r 1)(c 1) degrees of freedom, as n.
28 Example. The table below lists the counts on the beer preference and gender of beer drinker from randomly selected 150 individuals. Test at the 5% significance level the hypothesis that the preference is independent of the gender. Beer preference Light ale Lager Bitter Total Gender Male Female Total The expected frequencies are: E 11 = E 21 = = 26.67, E 12 = = 23.33, E 22 = = 37.33, E 13 = = 32.67, E 33 = = 16, = 14.
29 Under the null hypothesis of independence, T = i,j (Z ij E ij ) 2 /E ij χ 2 2. Note the degree freedom is (2 1)(3 1) = 2. Since T = > χ 0.05, 2 = 5.991, we reject the null hypothesis, i.e. there 2 is significant evidence from the data indicating that the beer preference and the gender of beer drinker are not independent.
30 Tests for several binomial distributions Consider a real example: Three independent samples of sizes 80, 120 and 100 are taken respectively from single, married, and widowed or divorced persons. Each individual was asked to if friends and social life or job and primary activity contributes most to their general well-being. The counts from the three samples are summarized in the table below. Friends and social life Job or primary activity Single Married Widowed or divorced Total
31 Conditional Inference: Sometimes we conduct inference under the assumption that all the row (or column) margins are fixed. Different from the tables for independent tests, now Z 1j Bi n(z j, p 1j ), j = 1, 2, 3, where Z j are fixed constants sample sizes. Furthermore, p 2j = 1 p 1j. We are interested in testing hypothesis H 0 : p 11 = p 12 = p 13. Under H 0, the three independent samples may be seen from the same population. Furthermore, Z 11 + Z 12 + Z 13 Bi n(z 1 + Z 2 + Z 3, p),
32 where p denotes the common value of p 11, p 12 and p 13. Therefore the MLE is p = Z 11 + Z 12 + Z 13 Z 1 + Z 2 + Z 3 = The expected frequencies are = E 1j = pz j and E 2j = Z j E 1j, j = 1, 2, Total Total 0 0 0
33 Total
34 Under H 0, T = i,j (Z ij E ij ) 2 /E ij χ 2 p d = χ2 2, where p = no. of free counts Z ij = 3 d = no. of estimated free parameters = 1. Since T = < χ ,2 = 4.605, we cannot reject H 0, i.e. there is no significant difference among the three populations in terms of choosing between F&SL and J&PA as the more important factor towards their general well-being. Remark. Similar to the independence tests, it holds that Z i E i = 0 and Z j E j = 0.
35 Tests for r c tables a general description In general, we may test for different types of the structure in a r c table, for example, the symmetry (p ij = p j i ). The key is to compute expected frequencies E ij under null hypothesis H 0. Under H 0, the test statistic T = r i =1 c (Z ij E ij ) 2 p = no. of free counts among Z ij, E ij χ 2 p d, d = no. of the estimated free parameters. We reject H 0 if T > χ 2 α, p d. Remark. The R-function chisq.test performs both the goodness-of-fit test and the contingency table test.
Lecture 28 Chi-Square Analysis
Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not
More informationTesting Independence
Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1
More informationChapter 10. Hypothesis Testing (I)
Chapter 10. Hypothesis Testing (I) Hypothesis Testing, together with statistical estimation, are the two most frequently used statistical inference methods. It addresses a different type of practical problems
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationStatistics 3858 : Contingency Tables
Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationWe know from STAT.1030 that the relevant test statistic for equality of proportions is:
2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which
More informationNominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers
Nominal Data Greg C Elvers 1 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics A parametric statistic is a statistic that makes certain
More informationSTAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression
STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationRecall the Basics of Hypothesis Testing
Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE
More information13.1 Categorical Data and the Multinomial Experiment
Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)
More informationSTAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).
STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More informationSTP 226 EXAMPLE EXAM #3 INSTRUCTOR:
STP 226 EXAMPLE EXAM #3 INSTRUCTOR: Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned. Signed Date PRINTED
More informationOne-Sample Numerical Data
One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationStatistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
More informationLecture 10: Generalized likelihood ratio test
Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two
More informationInferential statistics
Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,
More informationChi-Squared Tests. Semester 1. Chi-Squared Tests
Semester 1 Goodness of Fit Up to now, we have tested hypotheses concerning the values of population parameters such as the population mean or proportion. We have not considered testing hypotheses about
More informationCONTINUOUS RANDOM VARIABLES
the Further Mathematics network www.fmnetwork.org.uk V 07 REVISION SHEET STATISTICS (AQA) CONTINUOUS RANDOM VARIABLES The main ideas are: Properties of Continuous Random Variables Mean, Median and Mode
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationTopic 21 Goodness of Fit
Topic 21 Goodness of Fit Contingency Tables 1 / 11 Introduction Two-way Table Smoking Habits The Hypothesis The Test Statistic Degrees of Freedom Outline 2 / 11 Introduction Contingency tables, also known
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More information1 Hypothesis testing for a single mean
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More information4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49
4 HYPOTHESIS TESTING 49 4 Hypothesis testing In sections 2 and 3 we considered the problem of estimating a single parameter of interest, θ. In this section we consider the related problem of testing whether
More informationClass 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving
More informationInferential Statistics
Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free
More informationContents 1. Contents
Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample
More informationLog-linear Models for Contingency Tables
Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future
More informationLecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests
Lecture 9 Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Univariate categorical data Univariate categorical data are best summarized in a one way frequency table.
More informationNonparametric Location Tests: k-sample
Nonparametric Location Tests: k-sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationGoodness of Fit Goodness of fit - 2 classes
Goodness of Fit Goodness of fit - 2 classes A B 78 22 Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p A = 0.75! Exact p-value Exact confidence
More informationMATH5745 Multivariate Methods Lecture 07
MATH5745 Multivariate Methods Lecture 07 Tests of hypothesis on covariance matrix March 16, 2018 MATH5745 Multivariate Methods Lecture 07 March 16, 2018 1 / 39 Test on covariance matrices: Introduction
More informationTopic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.
Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick
More informationHypothesis testing: theory and methods
Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable
More informationIntroduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution
Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis
More informationSTAT FINAL EXAM
STAT101 2013 FINAL EXAM This exam is 2 hours long. It is closed book but you can use an A-4 size cheat sheet. There are 10 questions. Questions are not of equal weight. You may need a calculator for some
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationLet us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided
Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or
More informationThis does not cover everything on the final. Look at the posted practice problems for other topics.
Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry
More informationInvestigation of goodness-of-fit test statistic distributions by random censored samples
d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit
More informationPsych Jan. 5, 2005
Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population
More informationDr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests)
Dr. Maddah ENMG 617 EM Statistics 10/15/12 Nonparametric Statistics (2) (Goodness of fit tests) Introduction Probability models used in decision making (Operations Research) and other fields require fitting
More informationMore on nuisance parameters
BS2 Statistical Inference, Lecture 3, Hilary Term 2009 January 30, 2009 Suppose that there is a minimal sufficient statistic T = t(x ) partitioned as T = (S, C) = (s(x ), c(x )) where: C1: the distribution
More informationMathematical Statistics
Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics
More informationSTAT2912: Statistical Tests. Solution week 12
STAT2912: Statistical Tests Solution week 12 1. A behavioural biologist believes that performance of a laboratory rat on an intelligence test depends, to a large extent, on the amount of protein in the
More informationStat 135 Fall 2013 FINAL EXAM December 18, 2013
Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Name: Person on right SID: Person on left There will be one, double sided, handwritten, 8.5in x 11in page of notes allowed during the exam. The exam is closed
More informationPsych 230. Psychological Measurement and Statistics
Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State
More informationThe Multinomial Model
The Multinomial Model STA 312: Fall 2012 Contents 1 Multinomial Coefficients 1 2 Multinomial Distribution 2 3 Estimation 4 4 Hypothesis tests 8 5 Power 17 1 Multinomial Coefficients Multinomial coefficient
More informationSome General Types of Tests
Some General Types of Tests We may not be able to find a UMP or UMPU test in a given situation. In that case, we may use test of some general class of tests that often have good asymptotic properties.
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More information2.3 Analysis of Categorical Data
90 CHAPTER 2. ESTIMATION AND HYPOTHESIS TESTING 2.3 Analysis of Categorical Data 2.3.1 The Multinomial Probability Distribution A mulinomial random variable is a generalization of the binomial rv. It results
More informationHYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC
1 HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 7 steps of Hypothesis Testing 1. State the hypotheses 2. Identify level of significant 3. Identify the critical values 4. Calculate test statistics 5. Compare
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent
More informationThe goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.
The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining
More informationUnit 14: Nonparametric Statistical Methods
Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based
More informationNotes on the Multivariate Normal and Related Topics
Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions
More informationLISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014
LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 00 MODULE : Statistical Inference Time Allowed: Three Hours Candidates should answer FIVE questions. All questions carry equal marks. The
More informationChapter 10: Chi-Square and F Distributions
Chapter 10: Chi-Square and F Distributions Chapter Notes 1 Chi-Square: Tests of Independence 2 4 & of Homogeneity 2 Chi-Square: Goodness of Fit 5 6 3 Testing & Estimating a Single Variance 7 10 or Standard
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,
More information12.10 (STUDENT CD-ROM TOPIC) CHI-SQUARE GOODNESS- OF-FIT TESTS
CDR4_BERE601_11_SE_C1QXD 1//08 1:0 PM Page 1 110: (Student CD-ROM Topic) Chi-Square Goodness-of-Fit Tests CD1-1 110 (STUDENT CD-ROM TOPIC) CHI-SQUARE GOODNESS- OF-FIT TESTS In this section, χ goodness-of-fit
More informationMath 181B Homework 1 Solution
Math 181B Homework 1 Solution 1. Write down the likelihood: L(λ = n λ X i e λ X i! (a One-sided test: H 0 : λ = 1 vs H 1 : λ = 0.1 The likelihood ratio: where LR = L(1 L(0.1 = 1 X i e n 1 = λ n X i e nλ
More informationMean Vector Inferences
Mean Vector Inferences Lecture 5 September 21, 2005 Multivariate Analysis Lecture #5-9/21/2005 Slide 1 of 34 Today s Lecture Inferences about a Mean Vector (Chapter 5). Univariate versions of mean vector
More informationLecture 26: Likelihood ratio tests
Lecture 26: Likelihood ratio tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f θ0 (X) > c 0 for
More informationQuestions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.
Chapter 7 Reading 7.1, 7.2 Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.112 Introduction In Chapter 5 and 6, we emphasized
More informationModule 10: Analysis of Categorical Data Statistics (OA3102)
Module 10: Analysis of Categorical Data Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 14.1-14.7 Revision: 3-12 1 Goals for this
More information8 Comparing Two Populations via Independent Samples, Part 1/2
8 Comparing Two Populations via Independent Samples, Part /2 We ll study these ways of comparing two populations from independent samples:. The Two-Sample T-Test (Normal with Equal Variances) 2. The Welch
More informationChapter 26: Comparing Counts (Chi Square)
Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More information# of 6s # of times Test the null hypthesis that the dice are fair at α =.01 significance
Practice Final Exam Statistical Methods and Models - Math 410, Fall 2011 December 4, 2011 You may use a calculator, and you may bring in one sheet (8.5 by 11 or A4) of notes. Otherwise closed book. The
More informationRama Nada. -Ensherah Mokheemer. 1 P a g e
- 9 - Rama Nada -Ensherah Mokheemer - 1 P a g e Quick revision: Remember from the last lecture that chi square is an example of nonparametric test, other examples include Kruskal Wallis, Mann Whitney and
More informationChapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics
Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationTopic 19 Extensions on the Likelihood Ratio
Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis
More informationChapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests
Chapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests Throughout this chapter we consider a sample X taken from a population indexed by θ Θ R k. Instead of estimating the unknown parameter, we
More informationStatistics for Managers Using Microsoft Excel
Statistics for Managers Using Microsoft Excel 7 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Statistics for Managers Using Microsoft Excel 7e Copyright 014 Pearson Education, Inc. Chap
More informationST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios
ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories
More informationQuestion. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?
Hypothesis testing Question Very frequently: what is the possible value of μ? Sample: we know only the average! μ average. Random deviation or not? Standard error: the measure of the random deviation.
More informationQualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama
Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours
More informationProblem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56
STAT 391 - Spring Quarter 2017 - Midterm 1 - April 27, 2017 Name: Student ID Number: Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 Directions. Read directions carefully and show all your
More informationPart III: Unstructured Data
Inf1-DA 2010 2011 III: 51 / 89 Part III Unstructured Data Data Retrieval: III.1 Unstructured data and data retrieval Statistical Analysis of Data: III.2 Data scales and summary statistics III.3 Hypothesis
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationInferences for Correlation
Inferences for Correlation Quantitative Methods II Plan for Today Recall: correlation coefficient Bivariate normal distributions Hypotheses testing for population correlation Confidence intervals for population
More informationParametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami
Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous
More informationStatistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017
Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 I. χ 2 or chi-square test Objectives: Compare how close an experimentally derived value agrees with an expected value. One method to
More informationChapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.
Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:
More information6 Single Sample Methods for a Location Parameter
6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually
More informationSTAT 730 Chapter 5: Hypothesis Testing
STAT 730 Chapter 5: Hypothesis Testing Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 28 Likelihood ratio test def n: Data X depend on θ. The
More information