2.3 Analysis of Categorical Data
- James Lynch
- 5 years ago
Transcription
CHAPTER 2. ESTIMATION AND HYPOTHESIS TESTING

2.3 Analysis of Categorical Data

The Multinomial Probability Distribution

A multinomial random variable is a generalization of the binomial rv. It results from experiments consisting of $n$ trials with $k$ possible outcomes per trial, where $k \ge 2$. For $k = 2$ it is the binomial variable. Examples of multinomial distributions come from experiments with several categories, such as blood types O, A, B, AB (discrete rv) or income ranges [0, 5K), [5K, 10K), [10K, 20K), ... (continuous rv grouped into classes).

A multinomial experiment has the following properties:

1. The experiment consists of $n$ identical trials.
2. The outcome of each trial falls into one of $k$ classes (cells, categories).
3. The probability $p_i$ that the outcome of a single trial falls into category $i$ is constant from trial to trial.
4. The trials are independent.
5. The random variables of interest are $Y = (Y_1, \ldots, Y_k)$, where $Y_i$ denotes the number of trials whose outcomes fall into category $i$.

Note that

$$p_1 + \cdots + p_k = 1 \quad \text{and} \quad Y_1 + \cdots + Y_k = n.$$
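To make the five properties concrete, the following Python sketch simulates a single multinomial experiment and returns the vector of cell counts. The function name `multinomial_experiment` and the four category probabilities are illustrative assumptions, not values taken from the notes.

```python
import random
from collections import Counter


def multinomial_experiment(n, probs, rng):
    """Run n independent, identical trials.

    Each trial falls into category i with the constant probability probs[i].
    Returns the list of counts (Y_1, ..., Y_k).
    """
    categories = list(range(len(probs)))
    # Draw n independent outcomes with the given cell probabilities.
    outcomes = rng.choices(categories, weights=probs, k=n)
    counts = Counter(outcomes)
    return [counts.get(i, 0) for i in categories]


rng = random.Random(0)            # fixed seed so the run is repeatable
probs = [0.45, 0.40, 0.11, 0.04]  # hypothetical cell probabilities (sum to 1)
y = multinomial_experiment(200, probs, rng)
print(y, sum(y))                  # the counts always add up to n = 200
```

Whatever the seed, the counts sum to $n$, which is the constraint $Y_1 + \cdots + Y_k = n$ stated above.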
Definition 2.8. The random variables $Y = (Y_1, \ldots, Y_k)$ have a multinomial distribution with parameters $n$ and $p_1, \ldots, p_k$ if the joint pmf of $Y$ is given by

$$p_Y(y_1, \ldots, y_k) = \frac{n!}{y_1!\, y_2! \cdots y_k!}\; p_1^{y_1} p_2^{y_2} \cdots p_k^{y_k},$$

where $p_i > 0$, $\sum_{i=1}^{k} p_i = 1$, $\sum_{i=1}^{k} y_i = n$ and $y_i = 0, 1, 2, \ldots, n$.

The marginal distribution of $Y_i$ is binomial with parameters $n$ and $p_i$. To see this, merge all categories except category $i$; then each outcome of a trial falls either into cell $i$ or into the merged cell. So we have either success or failure, with probabilities $p_i$ and $1 - p_i$ respectively. Then,

$$E(Y_i) = np_i \quad \text{and} \quad \operatorname{var}(Y_i) = np_i(1 - p_i).$$

Chi-Square Goodness of Fit Tests

Fully Specified Distribution

Suppose we are interested in testing the hypothesis that the cell probabilities take some specified values, that is,

$$H_0: p_1 = p_{10},\; p_2 = p_{20},\; \ldots,\; p_k = p_{k0} \qquad H_1: \neg H_0,$$

where $\neg H_0$ means that the null hypothesis is not true. For given $n$, under $H_0$, we have a fully specified multinomial distribution and the expected numbers of observations falling into the categories are $E(Y_i) = np_i$, with $p_i = p_{i0}$, for $i = 1, \ldots, k$.

Karl Pearson derived a test statistic based on the assumption that if $H_0$ is true, then the random variables $Y_i$ should not differ much from their expected values. The test statistic is given in the following theorem.

Theorem 2.3. The statistic

$$X^2 = \sum_{i=1}^{k} \frac{(Y_i - np_i)^2}{np_i} \tag{2.2}$$

has, asymptotically, the $\chi^2_{k-1}$ distribution.

Proof. We will show the result for $k = 2$. In this case $Y_2 = n - Y_1$ and $p_2 = 1 - p_1$, so

$$X^2 = \frac{(Y_1 - np_1)^2}{np_1} + \frac{(Y_2 - np_2)^2}{np_2}
= \frac{(Y_1 - np_1)^2}{np_1} + \frac{[n - Y_1 - n(1 - p_1)]^2}{n(1 - p_1)}$$

$$= \frac{(Y_1 - np_1)^2}{np_1} + \frac{[-Y_1 + np_1]^2}{n(1 - p_1)}
= \frac{(Y_1 - np_1)^2}{np_1(1 - p_1)}
= \left( \frac{Y_1 - np_1}{\sqrt{np_1(1 - p_1)}} \right)^2.$$

By the Central Limit Theorem, for large $n$ the standardized random variable $Y_1$ has, approximately, a standard normal distribution, i.e.,

$$\frac{Y_1 - np_1}{\sqrt{np_1(1 - p_1)}} \sim N(0, 1), \text{ approximately}.$$

That is,

$$\left( \frac{Y_1 - np_1}{\sqrt{np_1(1 - p_1)}} \right)^2 \sim \chi^2_1, \text{ approximately}.$$

As a rejection region for such a test we choose the right-hand tail of the chi-squared distribution. This is because a small value of the test function, close to zero, would not contradict the null hypothesis: it would mean that the values of the rvs are not far from their expectations.
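The algebra in the proof can be checked numerically. The sketch below implements the statistic (2.2) and confirms, for one made-up pair of counts with $k = 2$, that it equals the squared standardized value of $Y_1$. The function name `pearson_statistic` and the numbers 58, 100 and 0.5 are assumptions chosen only for illustration.

```python
import math


def pearson_statistic(observed, probs):
    """Pearson's X^2 = sum over cells of (Y_i - n p_i)^2 / (n p_i)."""
    n = sum(observed)
    return sum((y - n * p) ** 2 / (n * p) for y, p in zip(observed, probs))


# Hypothetical k = 2 case: Y_1 = 58 successes in n = 100 trials, p_1 = 0.5.
y1, n, p1 = 58, 100, 0.5
x2 = pearson_statistic([y1, n - y1], [p1, 1 - p1])

# Squared standardized binomial count, as in the last line of the proof.
z = (y1 - n * p1) / math.sqrt(n * p1 * (1 - p1))

print(x2, z ** 2)  # the two values agree up to floating-point rounding
```

The agreement holds for any counts and any $0 < p_1 < 1$, since it is the identity $\frac{1}{np_1} + \frac{1}{n(1-p_1)} = \frac{1}{np_1(1-p_1)}$ used in the proof.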
Example. According to genetic theory, the seeds collected from a field of pink peas should produce plants with white, pink or red flowers in the proportion 1:2:1. Of 400 plants grown from such seeds, 93 had white flowers, 211 had pink flowers and 96 had red flowers. Do these results contradict the genetic theory?

Let $X$ denote a random variable with the discrete distribution given by

$$P(X = i) = \begin{cases} \frac{1}{4}, & \text{if } i = 1; \\ \frac{1}{2}, & \text{if } i = 2; \\ \frac{1}{4}, & \text{if } i = 3, \end{cases}$$

where $i = 1, 2, 3$ denotes white, pink and red colour, respectively. Here we have a fully specified distribution. The question is whether the data give evidence against this distribution. If this distribution is true, then the expected numbers of pea plants with white, pink and red flowers are, respectively, $n \cdot \frac{1}{4}$, $n \cdot \frac{1}{2}$ and $n \cdot \frac{1}{4}$. In the experiment $n = 400$ plants were observed, hence we should expect the numbers to be 100, 200 and 100.

Denote the cumulative distribution function of this rv by $F_0$. Then the null and alternative hypotheses are

$$H_0: F(x) = F_0(x) \text{ for all } x \qquad H_1: \neg H_0.$$

The observed and expected values are often put together in a frequency table. Here, we have

Category              White   Pink   Red
Observed frequency       93    211    96
Expected frequency      100    200   100

The value of the test function is

$$X^2_{obs} = \frac{(93 - 100)^2}{100} + \frac{(211 - 200)^2}{200} + \frac{(96 - 100)^2}{100} = 1.255.$$
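The arithmetic of this example can be reproduced with a few lines of Python. The sketch uses only the counts and the 1:2:1 ratio given above. For the tail probability it relies on the fact that a chi-square variable with 2 degrees of freedom has survival function $P(\chi^2_2 > x) = e^{-x/2}$, so no statistics library is needed; the variable names are illustrative.

```python
import math

observed = [93, 211, 96]  # white, pink, red (from the example)
ratios = [1, 2, 1]        # proportions predicted by the genetic theory

n = sum(observed)
expected = [n * r / sum(ratios) for r in ratios]  # 100, 200, 100

# Pearson statistic (2.2)
x2_obs = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 2 degrees of freedom, P(chi2 > x) = exp(-x / 2), hence the upper
# 100*alpha% point is -2 * ln(alpha).
alpha = 0.1
critical_value = -2.0 * math.log(alpha)
p_value = math.exp(-x2_obs / 2.0)

print(round(x2_obs, 3), round(critical_value, 3), round(p_value, 3))
print("reject H0" if x2_obs > critical_value else "do not reject H0")
```

The closed form for the tail is specific to 2 degrees of freedom; for other cases a chi-square table or a statistics library would be needed.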
For $\alpha = 0.1$ we have $\chi^2_{2;0.1} = 4.605$, which is larger than $X^2_{obs} = 1.255$. Hence, there is no evidence to reject the null hypothesis. The experimental data do not contradict the hypothesis that the proportions of white, pink and red flowers should be 1:2:1.

Families of Distributions

The chi-square goodness of fit tests are also used to verify hypotheses about families of distributions. For example, we may need to check that a random variable comes from a normal population (a common assumption of many tests) or that a Poisson distribution may be used to model some observations. These tests are also based on categories, which may be sets of values in a discrete case or real intervals in a continuous case. However, the probabilities that the variable is in a given category now depend on unknown parameters of the distribution which we test. In such a situation we replace the parameters with their estimates and use the following chi-square test function:

$$X^2 = \sum_{i=1}^{k} \frac{\left(Y_i - np_i(\hat{\vartheta})\right)^2}{np_i(\hat{\vartheta})} \sim \chi^2_{k-p-1}, \text{ approximately}, \tag{2.3}$$

where $p$ is the number of estimated parameters $\vartheta = (\vartheta_1, \ldots, \vartheta_p)$.

Example. Discrete random variable. The number of accidents per week at an intersection was recorded for $n = 50$ weeks, with the results given in the table below.

y                     0    1    2    3 or more
Observed frequency   32   12    6        0
Test the hypothesis that the random variable $Y$ has a Poisson distribution, assuming that the observations are independent.

The null and alternative hypotheses are

$$H_0: Y \sim \text{Poisson}(\lambda) \qquad H_1: \neg H_0.$$

If the null hypothesis is true, then the pmf is

$$P(Y = y; \lambda) = \frac{\lambda^y e^{-\lambda}}{y!}, \quad y = 0, 1, 2, \ldots$$

To calculate the estimates of the expected frequencies $np_i(\hat{\lambda})$ we need an estimate of $\lambda$. A good estimator of $\lambda$ is the sample mean $\bar{Y}$. It is the Maximum Likelihood Estimator, that is, the value of the parameter under which the observed data are most likely. Here we have

$$\hat{\lambda}_{obs} = \frac{1}{50}\,(0 \cdot 32 + 1 \cdot 12 + 2 \cdot 6 + 3 \cdot 0) = \frac{24}{50} = 0.48.$$

Note that the fourth category has no entries. In fact, categories with very small numbers of observations need to be merged with the neighboring cells so that no category has fewer than five entries. In this case we merge the last cell with the third one. Under the null hypothesis, the probabilities for these cells are

$$p_1(\lambda) = P(Y = 0) = e^{-\lambda}, \quad p_2(\lambda) = P(Y = 1) = \lambda e^{-\lambda}, \quad p_3(\lambda) = 1 - p_1(\lambda) - p_2(\lambda).$$

Thus, for the given data we obtain the following estimates of the expected frequencies:

$$n\,p_1(\hat{\lambda}_{obs}) = 50\,e^{-0.48} = 50 \times 0.619 = 30.95,$$
$$n\,p_2(\hat{\lambda}_{obs}) = 50 \times 0.48\,e^{-0.48} = 50 \times 0.297 = 14.85,$$
$$n\,p_3(\hat{\lambda}_{obs}) = 50\,(1 - 0.619 - 0.297) = 4.20.$$

This gives the frequency table

y                               0       1      2 or more
Observed frequency             32      12          6
Estimated expected frequency   30.95   14.85       4.20

and the value of the test function (2.3), with $k - p - 1 = 3 - 1 - 1 = 1$ degree of freedom,

$$X^2_{obs} = \frac{(32 - 30.95)^2}{30.95} + \frac{(12 - 14.85)^2}{14.85} + \frac{(6 - 4.20)^2}{4.20} = 1.354.$$

The upper $100\alpha\%$ point of the chi-square distribution with one degree of freedom at $\alpha = 0.1$ is $\chi^2_{1;0.1} = 2.706$. Hence, there is no evidence in the data to reject the null hypothesis, which says that the number of accidents at the junction follows a Poisson distribution.

Example. Continuous random variable. An astronomer is interested in the number of cloudless nights per year at a prospective telescope site. Over the last 87 years, the average number of cloudless nights is $\bar{x} = [...]$ and the sample variance is $s^2 = [...]$. He also has counts of years by number of cloudless nights, grouped into the intervals presented below.
Interval          Observed frequency $y_i$
160 or below      [...]
[...]             [...]
[...] or above    2

The question is whether $X$, the number of cloudless nights at the site, can be modelled by a normal distribution. That is, the null hypothesis is $H_0: X \sim N(\mu, \sigma^2)$.

We can use the test function (2.3), but some of the classes have too few entries. Hence, we will combine the first three cells and also the last two cells. Also, to obtain estimates of the expected frequencies, we need to calculate the probabilities of $X$ belonging to each class. Replacing $\mu$ and $\sigma$ by their estimates $\bar{X}$ and $S$, we have

$$P(a_i < X < b_i) \approx P\left( \frac{a_i - \bar{X}}{S} < Z < \frac{b_i - \bar{X}}{S} \right) = \Phi\left( \frac{b_i - \bar{X}}{S} \right) - \Phi\left( \frac{a_i - \bar{X}}{S} \right),$$

where $a_i$ and $b_i$ denote the limits of class $i$ and $\Phi(z)$ is the cdf of $Z \sim N(0, 1)$. For example,

$$P(200 < X < 220) \approx \Phi\left( \frac{220 - \bar{x}}{s} \right) - \Phi\left( \frac{200 - \bar{x}}{s} \right) = \Phi(-0.7717) - \Phi(-1.4354)$$
$$= 1 - \Phi(0.7717) - (1 - \Phi(1.4354)) = \Phi(1.4354) - \Phi(0.7717) \approx 0.145.$$
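The cell probability in this worked example can be evaluated without normal tables. The sketch below builds $\Phi$ from the error function in Python's standard library and applies it to the two standardized limits $-1.4354$ and $-0.7717$ quoted above. The helper names `std_normal_cdf` and `cell_probability` are assumptions made for this illustration. Since the sample mean and variance are not legible in this transcription, the function is called directly on the standardized limits (mean 0, standard deviation 1).

```python
import math


def std_normal_cdf(z):
    """Phi(z), the cdf of N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cell_probability(a, b, mean, sd):
    """Estimate P(a < X < b) for X ~ N(mean, sd^2)."""
    return std_normal_cdf((b - mean) / sd) - std_normal_cdf((a - mean) / sd)


# Standardized limits of the class (200, 220) as given in the example.
p = cell_probability(-1.4354, -0.7717, mean=0.0, sd=1.0)

# The same probability written with positive arguments, using symmetry.
p_symmetric = std_normal_cdf(1.4354) - std_normal_cdf(0.7717)

print(round(p, 3), round(p_symmetric, 3))
```

Multiplying such a probability by $n = 87$ gives the estimated expected frequency for the class.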
This gives the values $\hat{p}_i$ and hence the estimates $n\hat{p}_i$ of the expected frequencies for each class, as shown in the table below.

Interval          Observed frequencies $y_i$   Estimated cell probabilities $\hat{p}_i$   Estimated frequencies $n\hat{p}_i$
200 or below      [...]                        [...]                                      [...]
[...]             [...]                        [...]                                      [...]
[...] or above    [...]                        [...]                                      [...]

The observed value of the test statistic is $X^2_{obs} = [...]$. There are three degrees of freedom, and the rejection region at the $\alpha = 0.05$ level of significance is $(7.815, \infty)$, while at $\alpha = 0.01$ it is $(11.34, \infty)$. Hence, the data give some evidence against the null hypothesis, but it is not strong evidence: the value of the test statistic lies in the rejection region at $\alpha = 0.05$ but not at $\alpha = 0.01$.
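Returning to the accident-count example, the whole procedure for a family of distributions (estimate the parameter, compute estimated expected frequencies, evaluate (2.3), compare with the chi-square tail) can be sketched in Python. The merged counts 32, 12 and 6 are an assumption in the sense that they are the values consistent with $n = 50$, $\hat{\lambda}_{obs} = 0.48$ and a merged last cell of 6 entries. The tail probability uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds only for one degree of freedom. Because the code keeps unrounded probabilities, its statistic is close to, but not identical with, the hand-computed value.

```python
import math

# Merged cells: 0, 1, and "2 or more" accidents per week (assumed counts).
observed = [32, 12, 6]
n = sum(observed)

# Sample mean; every week in the last cell had exactly 2 accidents,
# because the original "3 or more" cell was empty.
lam_hat = sum(y * f for y, f in zip([0, 1, 2], observed)) / n

# Cell probabilities under Poisson(lam_hat); the last cell takes the rest.
p1 = math.exp(-lam_hat)
p2 = lam_hat * math.exp(-lam_hat)
probs = [p1, p2, 1.0 - p1 - p2]
expected = [n * p for p in probs]

# Test function (2.3) with k - p - 1 = 3 - 1 - 1 = 1 degree of freedom.
x2_obs = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Valid for 1 degree of freedom only: P(chi2_1 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2_obs / 2.0))

alpha = 0.1
print([round(e, 2) for e in expected], round(x2_obs, 3), round(p_value, 3))
print("reject H0" if p_value < alpha else "do not reject H0")
```

Comparing the p-value with $\alpha$ is equivalent to comparing $X^2_{obs}$ with the tabulated point $\chi^2_{1;\alpha}$.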
More informationThe University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80
The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple
More informationFormulas and Tables by Mario F. Triola
Copyright 010 Pearson Education, Inc. Ch. 3: Descriptive Statistics x f # x x f Mean 1x - x s - 1 n 1 x - 1 x s 1n - 1 s B variance s Ch. 4: Probability Mean (frequency table) Standard deviation P1A or
More informationProbability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!
Probability theory and inference statistics Dr. Paola Grosso SNE research group p.grosso@uva.nl paola.grosso@os3.nl (preferred) Roadmap Lecture 1: Monday Sep. 22nd Collecting data Presenting data Descriptive
More informationReading Material for Students
Reading Material for Students Arnab Adhikari Indian Institute of Management Calcutta, Joka, Kolkata 714, India, arnaba1@email.iimcal.ac.in Indranil Biswas Indian Institute of Management Lucknow, Prabandh
More informationMachine Learning. Lecture 3: Logistic Regression. Feng Li.
Machine Learning Lecture 3: Logistic Regression Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2016 Logistic Regression Classification
More informationSTAT 509 Section 3.4: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s.
STAT 509 Section 3.4: Continuous Distributions Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. A continuous random variable is one for which the outcome
More informationGeneralized Linear Models (1/29/13)
STA613/CBB540: Statistical methods in computational biology Generalized Linear Models (1/29/13) Lecturer: Barbara Engelhardt Scribe: Yangxiaolu Cao When processing discrete data, two commonly used probability
More informationStatistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017
Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 I. χ 2 or chi-square test Objectives: Compare how close an experimentally derived value agrees with an expected value. One method to
More informationMath 3215 Intro. Probability & Statistics Summer 14. Homework 5: Due 7/3/14
Math 325 Intro. Probability & Statistics Summer Homework 5: Due 7/3/. Let X and Y be continuous random variables with joint/marginal p.d.f. s f(x, y) 2, x y, f (x) 2( x), x, f 2 (y) 2y, y. Find the conditional
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationWe know from STAT.1030 that the relevant test statistic for equality of proportions is:
2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which
More informationStatistics for scientists and engineers
Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3
More informationEpidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval
Epidemiology 9509 Wonders of Biostatistics Chapter 11 (continued) - probability in a single population John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being
More information