STAC51: Categorical data Analysis


STAC51: Categorical Data Analysis
Mahinda Samarakoon
January 28, 2016

Table of contents
1. Inference for Proportions

Common Steps in Statistical Studies
Statistical studies usually involve the following steps:
- Clearly state the problem or question you are trying to answer.
- Think about what kind of data will help you answer the question.
- Decide on an appropriate statistical model for the data.
- Collect data.
- Clean the data (remove outliers, etc.) and examine it: data summaries, displays.
- Use the data to estimate the parameters of the model.
- Carry out appropriate tests that will answer your question. Sometimes you may have to reconsider the model, re-estimate the parameters and redo the tests.
- Draw conclusions about your question.

Common Steps in Statistical Studies: A simple example
We want to know if a coin is fair, i.e. whether P(H) = 0.5. (This is our question.)
- Data: toss the coin many times and observe the outcomes.
- Model for data: a Bernoulli model? What is the parameter? $\pi$
- Data collection: decide on n, toss the coin n times and record the outcomes.
- Parameter estimation: use the data to estimate the parameter.
- Appropriate hypotheses: $H_0: \pi = 0.5$ against $H_a: \pi \neq 0.5$. Use the data (and maybe the parameter estimates above) to do the test.
- Draw conclusions: does the test above indicate that the coin is not fair?
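The steps above can be sketched in R as a small simulation. The number of tosses (n = 100), the seed and the heads probability used to generate the data are assumptions made only for illustration; they are not from the lecture. The test used here is R's exact binomial test.

```r
# Illustrative sketch: simulate coin tosses and test fairness.
# n, the seed and true_pi are assumed values, not from the lecture.
set.seed(2016)
n <- 100
true_pi <- 0.5
tosses <- rbinom(n, size = 1, prob = true_pi)   # data collection: 1 = head

y <- sum(tosses)      # number of heads observed
pi_hat <- y / n       # parameter estimation: sample proportion

# Test H0: pi = 0.5 against Ha: pi != 0.5
test <- binom.test(y, n, p = 0.5, alternative = "two.sided")
pi_hat
test$p.value
```

A small p-value would suggest that the coin is not fair; otherwise the data are consistent with a fair coin.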

Inference for Proportions
Let Y be the number of successes (i.e. 1's) in n independent Bernoulli trials with success probability $\pi$. The success probability $\pi$ is usually an unknown parameter, and we estimate it by the sample proportion of successes:
$$\hat\pi = \frac{Y}{n}$$
Some properties of $\hat\pi$:
1. $\hat\pi$ is an unbiased estimator of $\pi$ (i.e. $E(\hat\pi) = \pi$).
2. $\mathrm{Var}(\hat\pi) = \frac{\pi(1-\pi)}{n}$.
3. $\hat\pi \xrightarrow{P} \pi$ by the WLLN.
4. $\hat\pi \overset{\text{approx}}{\sim} N\left(\pi, \frac{\pi(1-\pi)}{n}\right)$ for large n, by the CLT.
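Properties 1 and 2 can be checked by simulation. The sketch below is only an illustration: the values n = 50, $\pi$ = 0.3, the number of replications and the seed are assumptions, not part of the lecture.

```r
# Simulation check of E(pi_hat) = pi and Var(pi_hat) = pi(1 - pi)/n.
# n, pi_true, reps and the seed are assumed values for illustration.
set.seed(51)
n <- 50
pi_true <- 0.3
reps <- 20000

# Each replication: Y ~ Bin(n, pi_true), pi_hat = Y / n
pi_hat <- rbinom(reps, size = n, prob = pi_true) / n

mean(pi_hat)                  # should be close to pi_true
var(pi_hat)                   # should be close to the theoretical variance
pi_true * (1 - pi_true) / n   # theoretical variance
```

A histogram of `pi_hat` (for example `hist(pi_hat)`) should also look roughly normal, in line with property 4.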

Estimation of $\pi$: Likelihood function
Definition (page 9 of the text): The likelihood function is the probability of the observed data, expressed as a function of the parameter value.
Example: We toss a coin twice (i.e. n = 2) and observe one head (and one tail). P(H) = $\pi$ is unknown. Find the likelihood function.
Answer: The number of heads when a coin is tossed twice has a Bin(n = 2, $\pi$) distribution, so the likelihood function is
$$l(\pi) = \binom{2}{1}\pi^{1}(1-\pi)^{2-1} = 2\pi(1-\pi).$$

Estimation of $\pi$: Maximum Likelihood Estimator (MLE)
Definition (MLE): The maximum likelihood estimate (MLE) is the parameter value at which the likelihood function takes its maximum.
Example: We toss a coin twice (i.e. n = 2) and observe one head (and one tail). P(H) = $\pi$ is unknown. Find the MLE.
Answer: The number of heads when a coin is tossed twice has a Bin(n = 2, $\pi$) distribution, so the likelihood function is
$$l(\pi) = \binom{2}{1}\pi^{1}(1-\pi)^{2-1} = 2\pi(1-\pi).$$
$l(\pi)$ is maximized when $\pi = 0.5$, so the MLE of $\pi$ is 0.5.
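As a quick numerical check of this answer, the likelihood $2\pi(1-\pi)$ can be maximized with R's `optimize`. This is a minimal sketch; the function name `lik` is chosen here for illustration.

```r
# Likelihood for one head in two tosses: l(p) = choose(2, 1) * p * (1 - p)
lik <- function(p) choose(2, 1) * p^1 * (1 - p)^(2 - 1)

# Maximize over the parameter space [0, 1]
fit <- optimize(lik, interval = c(0, 1), maximum = TRUE)
fit$maximum     # numerically close to 0.5, the MLE
fit$objective   # likelihood at the maximum, close to 0.5
```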

Some Properties of MLEs
- If $Y_1, Y_2, \ldots, Y_n$ are i.i.d. Normal (or from many other distributions, such as the Poisson), the ML estimate of the population mean is the sample mean, $\hat\mu = \bar{Y}$.
- In ordinary regression ($Y \sim$ Normal), the least squares estimates are MLEs.
- For large n, MLEs have approximately normal sampling distributions (under weak conditions).
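The first property can be illustrated numerically for the Poisson case. In this sketch the sample size, the rate used to simulate the data and the seed are assumptions made for illustration; the numerical maximizer of the log-likelihood should agree with the sample mean.

```r
# Numerical check that the Poisson MLE of the mean equals the sample mean.
# The simulated data (200 draws, lambda = 3, seed 1) are assumed for illustration.
set.seed(1)
y <- rpois(200, lambda = 3)

# Negative log-likelihood of an i.i.d. Poisson sample
negloglik <- function(lambda) -sum(dpois(y, lambda, log = TRUE))

fit <- optimize(negloglik, interval = c(0.01, 20), tol = 1e-8)
c(mle = fit$minimum, sample_mean = mean(y))
```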

Example 2 (MLE)
A coin with P(H) = $\pi$ was tossed 20 times and 13 heads were observed. Find the likelihood function.
Answer: Dropping the constant factor $\binom{20}{13}$, which does not depend on $\pi$,
$$l(\pi) = \pi^{13}(1-\pi)^{20-13} = \pi^{13}(1-\pi)^{7}.$$
Plot the likelihood function and find the value of $\pi$ that maximizes $l(\pi)$.

Example 2 (MLE): R code and output
> # R code for finding the MLE of pi where Y ~ Bin(20, pi)
> # and observed y = 13
> likelihood <- function(pi) { (pi^13) * ((1 - pi)^7) }
> curve(likelihood, from = 0, to = 1, xlab = "pi", ylab = "likelihood(pi)")
> optimize(likelihood, interval = c(0, 1), maximum = TRUE)
$maximum
[1] 0.6500009

$objective
[1] 2.378756e-06

> abline(v = seq(0, 1, by = 0.02), col = "blue", lty = "dotted")
> abline(h = seq(0, 2.5e-6, 0.25e-6), col = "blue", lty = "dotted")

Example 2 (MLE)
Figure: Likelihood function
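The numerical maximum in Example 2 can also be confirmed analytically. The short derivation below works with the log-likelihood, which has the same maximizer as $l(\pi)$; it is added here as a supporting step and is not part of the original slides.

```latex
\begin{aligned}
\log l(\pi) &= 13\log\pi + 7\log(1-\pi), \\
\frac{d}{d\pi}\log l(\pi) &= \frac{13}{\pi} - \frac{7}{1-\pi} = 0
  \;\Longrightarrow\; 13(1-\pi) = 7\pi
  \;\Longrightarrow\; \hat\pi = \frac{13}{20} = 0.65, \\
\frac{d^2}{d\pi^2}\log l(\pi) &= -\frac{13}{\pi^2} - \frac{7}{(1-\pi)^2} < 0 .
\end{aligned}
```

The second derivative is negative everywhere on (0, 1), so $\hat\pi = 0.65 = y/n$ is the maximum, in agreement with the value returned by `optimize`.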

Significance Tests for a binomial parameter (i.e. proportions)
Let $Y \sim \mathrm{Bin}(n, \pi)$. We are interested in testing the null hypothesis $H_0: \pi = \pi_0$.
Example: We toss a coin n = 10 times and observe y = 3 heads, with P(H) = $\pi$. Test the null hypothesis $H_0: \pi = 0.5$ against $H_1: \pi < 0.5$.
Answer:
$$\begin{aligned}
p\text{-value} &= P(Y = 3) + P(Y = 2) + P(Y = 1) + P(Y = 0) \\
&= \binom{10}{3}(0.5)^{3}(1-0.5)^{10-3} + \binom{10}{2}(0.5)^{2}(1-0.5)^{10-2} \\
&\quad + \binom{10}{1}(0.5)^{1}(1-0.5)^{10-1} + \binom{10}{0}(0.5)^{0}(1-0.5)^{10-0} \\
&= 0.171875.
\end{aligned}$$
The p-value is greater than 0.05, so we do not reject the null hypothesis.
Note: In this case, p-value = $P(Y \le y_{obs})$.
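The lower-tail p-value above can be reproduced in R in three equivalent ways: summing binomial probabilities, using the binomial CDF, or calling `binom.test` with a one-sided alternative. This is a minimal sketch; the variable names are chosen for illustration.

```r
# Exact lower-tail test of H0: pi = 0.5 against H1: pi < 0.5, n = 10, y = 3
n <- 10
y_obs <- 3
pi0 <- 0.5

p_manual <- sum(dbinom(0:y_obs, size = n, prob = pi0))   # P(Y=0)+...+P(Y=3)
p_cdf    <- pbinom(y_obs, size = n, prob = pi0)          # P(Y <= y_obs)
p_test   <- binom.test(y_obs, n, p = pi0, alternative = "less")$p.value

c(p_manual, p_cdf, p_test)   # each equals 0.171875
```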

Significance Tests for a binomial parameter (i.e. proportions)
Example: We toss a coin n = 10 times and observe y = 8 heads, with P(H) = $\pi$. Test the null hypothesis $H_0: \pi = 0.6$ against $H_1: \pi > 0.6$.
Answer:
$$\begin{aligned}
p\text{-value} &= P(Y = 8) + P(Y = 9) + P(Y = 10) \\
&= \binom{10}{8}(0.6)^{8}(1-0.6)^{10-8} + \binom{10}{9}(0.6)^{9}(1-0.6)^{10-9} + \binom{10}{10}(0.6)^{10}(1-0.6)^{10-10} \\
&\approx 0.1673.
\end{aligned}$$
The p-value is greater than 0.05, so we do not reject the null hypothesis.
Note: In this case, p-value = $P(Y \ge y_{obs})$.
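For an upper-tail alternative the same calculation uses the upper tail of the binomial distribution. A minimal sketch, with variable names chosen for illustration:

```r
# Exact upper-tail test of H0: pi = 0.6 against H1: pi > 0.6, n = 10, y = 8
n <- 10
y_obs <- 8
pi0 <- 0.6

# P(Y >= y_obs) = P(Y > y_obs - 1)
p_upper <- pbinom(y_obs - 1, size = n, prob = pi0, lower.tail = FALSE)
p_test  <- binom.test(y_obs, n, p = pi0, alternative = "greater")$p.value

c(p_upper, p_test)   # both approximately 0.1673
```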

Significance Tests for a binomial parameter (i.e. proportions)
Example: We toss a coin n = 10 times and observe y = 8 heads, with P(H) = $\pi$. Test the null hypothesis $H_0: \pi = 0.6$ against $H_1: \pi \neq 0.6$.
In this case we take p-value = $2\min\left(P(Y \le y_{obs}),\ P(Y \ge y_{obs})\right)$. In the previous example, we found that $P(Y \ge y_{obs}) \approx 0.1673$.
$$\begin{aligned}
P(Y \le y_{obs}) &= P(Y = 8) + P(Y = 7) + \cdots + P(Y = 0) \\
&= \binom{10}{8}(0.6)^{8}(1-0.6)^{10-8} + \binom{10}{7}(0.6)^{7}(1-0.6)^{10-7} + \cdots + \binom{10}{0}(0.6)^{0}(1-0.6)^{10-0} \\
&= 0.953642.
\end{aligned}$$
p-value = $2\min(0.953642,\ 0.1673) = 2 \times 0.1673 \approx 0.3346 > 0.05$, so we do not reject the null hypothesis.

Significance Tests for a binomial parameter (i.e. proportions)
In the p-value for the two-tailed test, p-value = $2\min\left(P(Y \le y_{obs}),\ P(Y \ge y_{obs})\right)$, we include $y_{obs}$ in both terms. This sometimes gives values greater than 1. In that case we take the p-value to be 1.
Example: We toss a coin n = 10 times and observe y = 8 heads, with P(H) = $\pi$. Test the null hypothesis $H_0: \pi = 0.76$ against $H_1: \pi \neq 0.76$.
$$\begin{aligned}
P(Y \le y_{obs}) &= P(Y = 8) + P(Y = 7) + \cdots + P(Y = 0) \\
&= \binom{10}{8}(0.76)^{8}(1-0.76)^{10-8} + \binom{10}{7}(0.76)^{7}(1-0.76)^{10-7} + \cdots + \binom{10}{0}(0.76)^{0}(1-0.76)^{10-0} \\
&= 0.73269,
\end{aligned}$$
$$\begin{aligned}
P(Y \ge y_{obs}) &= P(Y = 8) + P(Y = 9) + P(Y = 10) \\
&= \binom{10}{8}(0.76)^{8}(1-0.76)^{10-8} + \binom{10}{9}(0.76)^{9}(1-0.76)^{10-9} + \binom{10}{10}(0.76)^{10}(1-0.76)^{10-10} \\
&= 0.55580.
\end{aligned}$$
$2\min(0.73269,\ 0.55580) = 2 \times 0.55580 = 1.1116 > 1$, so we take the p-value to be 1.
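The doubling rule, including the cap at 1, can be written as a small R function. This is a sketch of the rule as stated on these slides; the function name `two_sided_p` is chosen here for illustration. Note that R's `binom.test` uses a different two-sided definition (it sums the probabilities of all outcomes no more likely than the observed one), so its two-sided p-values need not match this rule.

```r
# Two-sided exact p-value by the doubling rule: 2 * min(P(Y <= y), P(Y >= y)),
# capped at 1. The function name is illustrative, not from the lecture.
two_sided_p <- function(y, n, pi0) {
  lower <- pbinom(y, size = n, prob = pi0)                          # P(Y <= y)
  upper <- pbinom(y - 1, size = n, prob = pi0, lower.tail = FALSE)  # P(Y >= y)
  min(1, 2 * min(lower, upper))
}

two_sided_p(8, 10, 0.6)    # approximately 0.3346
two_sided_p(8, 10, 0.76)   # doubling gives about 1.11, so the result is capped at 1
```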

Large sample tests
For testing the null hypothesis $H_0: \pi = \pi_0$, we can use the test statistic
$$Z = \frac{\hat\pi - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} \qquad (1)$$
Under the null hypothesis and for a large enough sample size n, $Z \overset{\text{approx}}{\sim} N(0, 1)$. This result can be used to calculate the p-value for the test of $H_0: \pi = \pi_0$.
A large-sample $100(1-\alpha)$ percent confidence interval for $\pi$ is given by $\hat\pi \pm z_{\alpha/2}\,SE$, where $SE = \sqrt{\hat\pi(1-\hat\pi)/n}$.

Example (Agresti): When the 2000 General Social Survey asked subjects whether they would be willing to accept cuts in their standard of living to protect the environment, 344 of 1170 subjects said yes.
a) Estimate the population proportion who would say yes.
b) Conduct a significance test to determine whether a majority or minority of the population would say yes. Report and interpret the p-value.

The R code below calculates the value of the test statistic, the p-value and the sample proportion; alpha is set for the confidence interval.
> # R code for the z-test for a single proportion
> y <- 344
> n <- 1170
> p0 <- 0.5
> alpha <- 0.01
> phat <- y/n
> z <- (phat - p0)/sqrt((p0*(1 - p0))/n)
> p_value <- 2*(1 - pnorm(abs(z)))
> z
[1] -14.0914
> p_value
[1] 0
> phat
[1] 0.2940171

The R command prop.test also produces the calculations required to answer these questions. The command help(prop.test) shows details of the command. R reports a chi-square test that is equivalent to the Z-test (X-squared is the square of z).
> res <- prop.test(x=344, n=1170, conf.level=0.99, correct=F, p=
> res
1-sample proportions test without continuity correction
data: 344 out of 1170, null probability 0.5
X-squared = 198.5675, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
p
0.2940171
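The equivalence between the chi-square statistic reported by `prop.test` and the Z statistic can be checked directly. The sketch below assumes the null value 0.5 shown in the output above ("null probability 0.5") and no continuity correction.

```r
# Check that prop.test's X-squared equals z^2 for the GSS example.
# p0 = 0.5 is taken from the "null probability 0.5" line of the output.
y <- 344
n <- 1170
p0 <- 0.5

z <- (y / n - p0) / sqrt(p0 * (1 - p0) / n)
res <- prop.test(x = y, n = n, p = p0, correct = FALSE)

c(z_squared = z^2, X_squared = unname(res$statistic))   # both about 198.5675
```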

Confidence intervals for Proportions
For large samples we have used the formula $\hat\pi \pm z_{\alpha/2}\,SE$, where $SE = \sqrt{\hat\pi(1-\hat\pi)/n}$, for approximate confidence intervals.
In the above example, $\hat\pi = 344/1170 = 0.294$,
$$SE = \sqrt{\frac{\hat\pi(1-\hat\pi)}{n}} = \sqrt{\frac{0.294(1-0.294)}{1170}} = 0.013319,$$
and the 99% confidence interval ($\alpha = 0.01$, $z_{\alpha/2} = 2.575$) is $0.294 \pm 2.575 \times 0.013319 = (0.2597081,\ 0.3283261)$, where the limits are computed from unrounded values.

Confidence intervals for Proportions
For large samples we have used the formula $\hat\pi \pm z_{\alpha/2}\,SE$, where $SE = \sqrt{\hat\pi(1-\hat\pi)/n}$, for approximate confidence intervals.
This confidence interval, known as Wald's confidence interval, is based on the approximate Normal distribution of $\hat\pi$. For large enough n,
$$Z = \frac{\hat\pi - \pi}{\sqrt{\pi(1-\pi)/n}} \overset{\text{approx}}{\sim} N(0, 1)$$
and so
$$P\left(\hat\pi - z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}} < \pi < \hat\pi + z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}\right) \approx 1 - \alpha.$$
Wald's method replaces $\pi$ by $\hat\pi$ in the standard deviation (i.e. $\sqrt{\pi(1-\pi)/n}$ becomes $\sqrt{\hat\pi(1-\hat\pi)/n}$) to get $\hat\pi \pm z_{\alpha/2}\sqrt{\hat\pi(1-\hat\pi)/n}$.
The Wald CI often performs poorly in categorical data analysis unless n is quite large.
Example: For n = 25 and y = 0, $\hat\pi = 0$ and the Wald confidence interval is (0, 0).
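The Wald interval is easy to compute directly. The sketch below (the function name `wald_ci` is chosen here for illustration) reproduces the 99% interval for the GSS example and shows the degenerate interval for n = 25, y = 0.

```r
# Wald confidence interval: pi_hat +/- z * sqrt(pi_hat * (1 - pi_hat) / n).
# The function name is illustrative, not from the lecture.
wald_ci <- function(y, n, conf = 0.95) {
  phat <- y / n
  z <- qnorm(1 - (1 - conf) / 2)          # z_{alpha/2}
  se <- sqrt(phat * (1 - phat) / n)
  c(lower = phat - z * se, upper = phat + z * se)
}

wald_ci(344, 1170, conf = 0.99)   # about (0.2597, 0.3283)
wald_ci(0, 25)                    # degenerate interval (0, 0)
```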

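Score Confidence intervals for Proportions (Wilson score confidence interval)
In the score confidence interval (Wilson's score method), we collect the values of $\pi$ such that
$$|Z| = \frac{|\hat\pi - \pi|}{\sqrt{\pi(1-\pi)/n}} \le z_{\alpha/2}.$$
This method does not replace $\pi$ by $\hat\pi$ as in the Wald confidence interval, where the substitution of $\sqrt{\hat\pi(1-\hat\pi)/n}$ for $\sqrt{\pi(1-\pi)/n}$ gives $\hat\pi \pm z_{\alpha/2}\sqrt{\hat\pi(1-\hat\pi)/n}$.
We find the upper and the lower limits of the confidence interval by solving
$$\frac{\hat\pi - \pi}{\sqrt{\pi(1-\pi)/n}} = \pm z_{\alpha/2}.$$
This interval is given by
$$\hat\pi\left(\frac{n}{n+z_{\alpha/2}^2}\right) + \frac{1}{2}\left(\frac{z_{\alpha/2}^2}{n+z_{\alpha/2}^2}\right) \pm z_{\alpha/2}\sqrt{\frac{1}{n+z_{\alpha/2}^2}\left[\hat\pi(1-\hat\pi)\left(\frac{n}{n+z_{\alpha/2}^2}\right) + \frac{1}{4}\left(\frac{z_{\alpha/2}^2}{n+z_{\alpha/2}^2}\right)\right]}.$$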

Score Confidence intervals for Proportions (Wilson score confidence interval)
Note 1: The score confidence interval can also be interpreted as the set of values of $\pi_0$ for which the p-value for testing the null hypothesis $H_0: \pi = \pi_0$ against the two-sided alternative $H_1: \pi \neq \pi_0$, using the test statistic
$$Z = \frac{\hat\pi - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}},$$
is greater than $\alpha$.
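Note 1 can be checked numerically: at each endpoint of the score interval the two-sided score-test p-value should equal $\alpha$. The sketch below uses the coin data y = 23, n = 32 that appear later in these slides, and relies on the fact that `prop.test` without continuity correction returns the Wilson score interval. The helper name `score_p` is chosen for illustration.

```r
# At the endpoints of the score (Wilson) interval the score-test p-value is alpha.
y <- 23
n <- 32
alpha <- 0.05

# prop.test with correct = FALSE returns the Wilson score interval
ci <- as.numeric(prop.test(y, n, correct = FALSE, conf.level = 1 - alpha)$conf.int)

# Two-sided p-value of the score test of H0: pi = pi0
score_p <- function(pi0) {
  z <- (y / n - pi0) / sqrt(pi0 * (1 - pi0) / n)
  2 * pnorm(-abs(z))
}

ci
sapply(ci, score_p)   # both values are (numerically) equal to alpha
```

Values of $\pi_0$ strictly inside the interval give p-values above $\alpha$, and values outside give p-values below it.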

Score Confidence intervals for Proportions (Wilson score confidence interval)
Note 2: The midpoint of the above interval is
$$\hat\pi\left(\frac{n}{n+z_{\alpha/2}^2}\right) + \frac{1}{2}\left(\frac{z_{\alpha/2}^2}{n+z_{\alpha/2}^2}\right) = \frac{y + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2},$$
and for $\alpha = 0.05$
$$\frac{y + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2} = \frac{y + 1.96^2/2}{n + 1.96^2} \approx \frac{y+2}{n+4}.$$
For this reason some authors consider $\frac{y + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2}$ an improved estimate of $\pi$.

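Score Confidence intervals for Proportions: Example
For n = 25, y = 0, $\hat\pi = 0$, the Wald confidence interval was (0, 0). The score interval is
$$\left[\hat\pi\left(\frac{n}{n+z_{\alpha/2}^2}\right) + \frac{1}{2}\left(\frac{z_{\alpha/2}^2}{n+z_{\alpha/2}^2}\right)\right] \pm z_{\alpha/2}\sqrt{\frac{1}{n+z_{\alpha/2}^2}\left[\hat\pi(1-\hat\pi)\left(\frac{n}{n+z_{\alpha/2}^2}\right) + \frac{1}{4}\left(\frac{z_{\alpha/2}^2}{n+z_{\alpha/2}^2}\right)\right]}$$
$$= \left[0 + \frac{1}{2}\left(\frac{1.96^2}{25+1.96^2}\right)\right] \pm 1.96\sqrt{\frac{1}{25+1.96^2}\left[0 + \frac{1}{4}\left(\frac{1.96^2}{25+1.96^2}\right)\right]} = (0,\ 0.133).$$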
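The score interval formula can be coded directly and compared with `prop.test`, which returns the same interval when the continuity correction is turned off. This is a minimal sketch; the function name `wilson_ci` is chosen here for illustration.

```r
# Wilson score interval, coded from the midpoint +/- half-width formula above.
# The function name is illustrative, not from the lecture.
wilson_ci <- function(y, n, conf = 0.95) {
  phat <- y / n
  z <- qnorm(1 - (1 - conf) / 2)
  w_n <- n / (n + z^2)          # weight on pi_hat
  w_z <- z^2 / (n + z^2)        # weight on 1/2
  mid <- phat * w_n + 0.5 * w_z
  half <- z * sqrt((1 / (n + z^2)) * (phat * (1 - phat) * w_n + 0.25 * w_z))
  c(lower = mid - half, upper = mid + half)
}

wilson_ci(0, 25)                               # about (0, 0.133)
prop.test(0, 25, correct = FALSE)$conf.int     # same interval from prop.test
```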

Agresti and Coull Confidence interval
In the score interval, we saw that $\tilde\pi = \frac{y + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2}$ is a better estimate of $\pi$ than $\hat\pi$.
Agresti and Coull (1998) suggest replacing $\hat\pi$ in the Wald confidence interval by $\tilde\pi$ to get
$$\tilde\pi \pm z_{\alpha/2}\sqrt{\frac{\tilde\pi(1-\tilde\pi)}{\tilde n}}, \qquad \text{where } \tilde n = n + z_{\alpha/2}^2.$$
This interval is called the Agresti-Coull confidence interval.
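A minimal R sketch of the Agresti-Coull interval follows, applied to the n = 25, y = 0 example used for the Wald and score intervals. The function name `agresti_coull_ci` is chosen here for illustration.

```r
# Agresti-Coull interval: Wald formula with pi_tilde and n_tilde = n + z^2.
# The function name is illustrative, not from the lecture.
agresti_coull_ci <- function(y, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  n_tilde <- n + z^2
  pi_tilde <- (y + z^2 / 2) / n_tilde
  half <- z * sqrt(pi_tilde * (1 - pi_tilde) / n_tilde)
  c(lower = pi_tilde - half, upper = pi_tilde + half)
}

ci <- agresti_coull_ci(0, 25)
ci
```

For y = 0 the lower limit falls below 0; in practice such a limit is usually truncated at 0, since $\pi$ is a probability.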

Likelihood Ratio Test of $H_0: \pi = \pi_0$ against $H_1: \pi \neq \pi_0$
Let $Y \sim \mathrm{Bin}(n, \pi)$. Then the likelihood function (up to a constant factor) is $l(\pi) = \pi^{y}(1-\pi)^{n-y}$.
The likelihood ratio test of $H_0: \pi = \pi_0$ against $H_1: \pi \neq \pi_0$ rejects $H_0$ for small values of $\Lambda = l(\pi_0)/l(\hat\pi)$, i.e. if $\Lambda$ is smaller than some critical value.
Wilks (1938) showed that under the null hypothesis $H_0: \pi = \pi_0$, $-2\log\Lambda$ has a limiting chi-square distribution with 1 degree of freedom as $n \to \infty$.
Note: In this course we use natural logarithms throughout.
We will use this limiting distribution in the likelihood ratio test and for calculating approximate confidence intervals based on the likelihood ratio.

Likelihood Ratio Test of $H_0: \pi = \pi_0$ against $H_1: \pi \neq \pi_0$
Note 1:
$$\Lambda = \frac{\text{maximum likelihood when } H_0 \text{ is true}}{\text{maximum likelihood with no restriction}}$$
Note 2: $-2\log\Lambda = -2\log\left(l(\pi_0)/l(\hat\pi)\right) = -2(L_0 - L_1)$, where $L_0 = \log l(\pi_0)$ and $L_1 = \log l(\hat\pi)$. We use log for natural logarithms.
The likelihood ratio test rejects $H_0$ if $-2\log\Lambda = -2(L_0 - L_1) > \chi^2_1(\alpha)$, where $\chi^2_1(\alpha)$ is the $100(1-\alpha)$th percentile (the upper $\alpha$ quantile) of the chi-square distribution with 1 degree of freedom.
The likelihood ratio test statistic simplifies to
$$-2(L_0 - L_1) = 2\left[y\log\frac{\hat\pi}{\pi_0} + (n-y)\log\frac{1-\hat\pi}{1-\pi_0}\right].$$
This can also be expressed as
$$-2(L_0 - L_1) = 2\left[y\log\frac{y}{n\pi_0} + (n-y)\log\frac{n-y}{n-n\pi_0}\right]$$
and
$$-2(L_0 - L_1) = 2\sum \text{observed}\,\log\left(\frac{\text{observed}}{\text{fitted}}\right).$$
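The simplification in Note 2 follows in two steps from the binomial log-likelihood $L(\pi) = y\log\pi + (n-y)\log(1-\pi)$. The derivation below is added as a supporting step and uses only the definitions on this slide.

```latex
\begin{aligned}
L_0 - L_1 &= y\log\pi_0 + (n-y)\log(1-\pi_0) - y\log\hat\pi - (n-y)\log(1-\hat\pi) \\
          &= -\left[\, y\log\frac{\hat\pi}{\pi_0} + (n-y)\log\frac{1-\hat\pi}{1-\pi_0} \right], \\
-2(L_0 - L_1) &= 2\left[\, y\log\frac{\hat\pi}{\pi_0} + (n-y)\log\frac{1-\hat\pi}{1-\pi_0} \right]
  = 2\left[\, y\log\frac{y}{n\pi_0} + (n-y)\log\frac{n-y}{n-n\pi_0} \right],
\end{aligned}
```

where the last equality substitutes $\hat\pi = y/n$. The observed counts are $y$ and $n-y$, and the fitted counts under $H_0$ are $n\pi_0$ and $n - n\pi_0$, which gives the "observed log(observed/fitted)" form.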

Likelihood Ratio Test of $H_0: \pi = \pi_0$ against $H_1: \pi \neq \pi_0$: Example
A coin was tossed 32 times and 23 heads were observed. Use the likelihood ratio test to test the null hypothesis $H_0: \pi = 0.5$ against $H_1: \pi \neq 0.5$.
Solution:
$$\begin{aligned}
-2\log\Lambda = -2(L_0 - L_1) &= 2\sum \text{observed}\,\log\left(\frac{\text{observed}}{\text{fitted}}\right) \\
&= 2\left(23\log\left(\frac{23}{32 \times 0.5}\right) + (32-23)\log\left(\frac{32-23}{32 - 32 \times 0.5}\right)\right) \\
&= 6.337098101 > \chi^2_1(0.05) = 3.841,
\end{aligned}$$
and so we reject the null hypothesis.
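The likelihood ratio statistic in this example can be computed with a few lines of R. This is a minimal sketch that assumes 0 < y < n so that both logarithms are finite; the function name `lrt_stat` is chosen here for illustration.

```r
# Likelihood ratio statistic -2 log(Lambda) for H0: pi = pi0 (assumes 0 < y < n).
# The function name is illustrative, not from the lecture.
lrt_stat <- function(y, n, pi0) {
  phat <- y / n
  2 * (y * log(phat / pi0) + (n - y) * log((1 - phat) / (1 - pi0)))
}

stat <- lrt_stat(23, 32, 0.5)
crit <- qchisq(0.95, df = 1)                        # chi-square critical value, 3.841
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

stat      # about 6.3371
crit
p_value   # below 0.05, so H0 is rejected
```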

Likelihood Ratio based Confidence intervals for $\pi$
The likelihood-based confidence interval for $\pi$ is the set of values of $\pi_0$ for which $-2(L(\pi_0) - L(\hat\pi)) < \chi^2_1(\alpha)$.
We can find the boundaries of the interval by solving the equation $-2(L(\pi_0) - L(\hat\pi)) = \chi^2_1(\alpha)$, or equivalently $-2(L(\pi_0) - L(\hat\pi)) - \chi^2_1(\alpha) = 0$.
This often requires a numerical solution.

Likelihood Ratio based Confidence intervals for $\pi$: Example
A coin was tossed 32 times and 23 heads were observed. Find a likelihood ratio test based 95% confidence interval for $\pi$.
Solution: We get the upper and the lower limits of the likelihood ratio based confidence interval by solving $-2(L(\pi_0) - L(\hat\pi)) - \chi^2_1(\alpha) = 0$. Substituting values, the equation becomes
$$-2\left[23\log(\pi_0) + (32-23)\log(1-\pi_0) - 23\log(23/32) - (32-23)\log(1-23/32)\right] - \chi^2_1(0.05) = 0.$$

Likelihood Ratio based Confidence intervals for $\pi$: Example
The R code and output below show the numerical solution to this equation and the likelihood ratio based confidence interval.
> # R code for Likelihood Ratio based Confidence interval
> # p 12 Agresti
> library(rootSolve)
> n <- 32
> y <- 23
> phat <- y/n
> alpha <- 0.05
> f1 <- function(pi0) {
+ -2*(y*log(pi0) + (n-y)*log(1-pi0) - y*log(phat) - (n-y)*log(1-phat)) - qchisq(1-alpha, df=1)
+ }
> uniroot.all(f=f1, interval=c(0.000001, 0.999999))
[1] 0.5501852 0.8535842
> curve(f1, from=0, to=1, xlab="pi0", ylab="f1(pi0)")
> abline(h=0, col="red")
The two roots give the 95% likelihood ratio based confidence interval (0.5502, 0.8536).

Likelihood Ratio based Confidence intervals for $\pi$: Example
> curve(f1, from=0, to=1, xlab="pi0", ylab="f1(pi0)")
> abline(h=0, col="red")
> abline(v=seq(0, 1, by=0.02), col="blue", lty="dotted")
> abline(h=seq(0, 170, 10), col="blue", lty="dotted")

Likelihood Ratio based Confidence intervals for $\pi$: Example
Figure: Likelihood Ratio Confidence Interval
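If the rootSolve package is not available, the same two limits can be found with base R's `uniroot`, searching separately on each side of $\hat\pi$. This is an alternative sketch, not the method used on the slides; the search end points 1e-6 and 1 - 1e-6 and the tolerance are assumptions chosen for illustration.

```r
# Likelihood ratio based 95% CI for pi using base R only (no rootSolve).
# Search end points and tolerance are assumed values for illustration.
n <- 32
y <- 23
phat <- y / n
alpha <- 0.05

# f1(pi0) = -2 * (L(pi0) - L(phat)) - chi-square critical value
f1 <- function(pi0) {
  -2 * (y * log(pi0) + (n - y) * log(1 - pi0) -
          y * log(phat) - (n - y) * log(1 - phat)) - qchisq(1 - alpha, df = 1)
}

# f1 is negative at phat and positive near 0 and 1, so there is one root per side
lower <- uniroot(f1, interval = c(1e-6, phat), tol = 1e-10)$root
upper <- uniroot(f1, interval = c(phat, 1 - 1e-6), tol = 1e-10)$root
c(lower = lower, upper = upper)   # about (0.5502, 0.8536)
```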