STAT 705: Analysis of Contingency Tables
Timothy Hanson, Department of Statistics, University of South Carolina

1 Tests for a binomial probability π

Let $Y \sim \mathrm{bin}(n,\pi)$. The likelihood is
\[ L(\pi) = \binom{n}{y}\pi^y(1-\pi)^{n-y}, \]
and the log-likelihood is
\[ \ell(\pi) = \log\binom{n}{y} + y\log\pi + (n-y)\log(1-\pi). \]
So
\[ \ell'(\pi) = \frac{y}{\pi} - \frac{n-y}{1-\pi}. \]

2 Solving $\ell'(\pi) = 0$ for $\pi$ gives the MLE $\hat\pi = y/n$, the number of successes out of the total number of trials. Taking the second derivative of $\ell(\pi)$ gives
\[ \ell''(\pi) = -\frac{y}{\pi^2} - \frac{n-y}{(1-\pi)^2}, \]
and so the Fisher information is
\[ -E\{\ell''(\pi)\} = E\!\left(\frac{Y}{\pi^2} + \frac{n-Y}{(1-\pi)^2}\right) = \frac{n\pi}{\pi^2} + \frac{n-n\pi}{(1-\pi)^2} = \frac{n}{\pi(1-\pi)}. \]
The large-sample result is then
\[ \hat\pi = \frac{Y}{n} \;\dot\sim\; N\!\left(\pi,\ \frac{\pi(1-\pi)}{n}\right). \]
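As a quick numerical check, here is a minimal R sketch (with made-up data $y = 7$, $n = 25$) that maximizes the log-likelihood directly and compares the answer to the closed-form MLE and its large-sample standard error:

## Hypothetical data: y successes out of n trials
y <- 7; n <- 25

## Binomial log-likelihood, dropping the constant log choose(n, y)
loglik <- function(p) y * log(p) + (n - y) * log(1 - p)

## Numerical MLE via one-dimensional optimization
opt <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)

pi.hat <- y / n                            # closed-form MLE
se.hat <- sqrt(pi.hat * (1 - pi.hat) / n)  # large-sample standard error
c(numerical = opt$maximum, closed.form = pi.hat, se = se.hat)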

3 Let's consider $H_0: \pi = \pi_0$, where $\pi_0$ is fixed and known (e.g. $H_0: \pi = 0.5$). The Wald test plugs the MLE $\hat\pi = y/n$ in for the unknown $\pi$ in the large-sample variance:
\[ \hat\pi = \frac{Y}{n} \;\dot\sim\; N\!\left(\pi,\ \frac{\hat\pi(1-\hat\pi)}{n}\right). \]
Recall that $se(\hat\pi) = \sqrt{\hat\pi(1-\hat\pi)/n}$. So then
\[ Z_W = \frac{\hat\pi - \pi_0}{se(\hat\pi)} = \frac{\hat\pi - \pi_0}{\sqrt{\hat\pi(1-\hat\pi)/n}} \;\dot\sim\; N(0,1) \]
when $H_0$ is true. Squaring, $W = Z_W^2 \;\dot\sim\; \chi^2_1$.

4 The score test plugs the null value into the large-sample variance:
\[ \hat\pi = \frac{Y}{n} \;\dot\sim\; N\!\left(\pi,\ \frac{\pi_0(1-\pi_0)}{n}\right). \]
So then
\[ Z_S = \frac{\hat\pi - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} \;\dot\sim\; N(0,1) \]
when $H_0$ is true. Squaring, $S = Z_S^2 \;\dot\sim\; \chi^2_1$.

5 Evaluating the log-likelihood at the unconstrained MLE gives
\[ L_1 = \ell(\hat\pi) = \log\binom{n}{y} + y\log\hat\pi + (n-y)\log(1-\hat\pi). \]
Under the constraint $H_0: \pi = \pi_0$ the log-likelihood is simply
\[ L_0 = \ell(\pi_0) = \log\binom{n}{y} + y\log\pi_0 + (n-y)\log(1-\pi_0) \]
(there are no parameters left to maximize the constrained likelihood over!), and so the LRT statistic, plugging $Y$ in for $y$, is
\[ L = -2(L_0 - L_1) = 2\left\{ Y\log\frac{\hat\pi}{\pi_0} + (n-Y)\log\frac{1-\hat\pi}{1-\pi_0} \right\} \;\dot\sim\; \chi^2_1 \]
when $H_0$ is true.

6 In all three cases, an approximate $\alpha = 0.05$ significance test of $H_0: \pi = \pi_0$ is carried out by computing $W$, $S$, or $L$ and rejecting if the test statistic is larger than the quantile corresponding to 0.05 right-tail probability from a $\chi^2_1$ distribution, i.e. larger than $\chi^2_1(0.05) = 3.84$. Confidence intervals are obtained by inverting the test statistics.
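For concreteness, the following R sketch computes all three statistics for made-up data $y = 7$, $n = 25$ and null value $\pi_0 = 0.5$, and compares them to the $\chi^2_1$ critical value:

## Hypothetical data and null value
y <- 7; n <- 25; pi0 <- 0.5
pi.hat <- y / n

## Wald statistic: difference standardized by the estimated variance
W <- (pi.hat - pi0)^2 / (pi.hat * (1 - pi.hat) / n)

## Score statistic: difference standardized by the null variance
S <- (pi.hat - pi0)^2 / (pi0 * (1 - pi0) / n)

## Likelihood ratio statistic
L <- 2 * (y * log(pi.hat / pi0) + (n - y) * log((1 - pi.hat) / (1 - pi0)))

## Compare to the chi-square critical value and get p-values
crit <- qchisq(0.95, df = 1)              # 3.84
cbind(stat = c(W = W, S = S, L = L),
      reject = c(W, S, L) > crit,
      p.value = pchisq(c(W, S, L), df = 1, lower.tail = FALSE))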

7 Wherefore art thou, vegetarians?

Out of n = 25 students, y = 0 were vegetarians (Agresti, 2002). Assuming binomial data, the 95% CIs found by inverting the Wald, score, and LRT tests are

Wald  (0, 0)
score (0, 0.133)
LRT   (0, 0.074)

The Wald interval is particularly troublesome. Why the difference? For small or large (true, unknown) $\pi$, the normal approximation to the distribution of $\hat\pi$ is pretty bad in small samples. A solution is to consider the exact sampling distribution of $\hat\pi$ rather than a normal approximation.
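One way to see where the score and LRT intervals come from is to invert the tests numerically: scan a grid of null values $\pi_0$ and keep those not rejected at level 0.05. A minimal R sketch for $y = 0$, $n = 25$ (the Wald interval collapses to $\{0\}$ here because $se(\hat\pi) = 0$ when $y = 0$):

## Invert the score and LRT tests over a grid of null values pi0
y <- 0; n <- 25
pi.hat <- y / n
crit <- qchisq(0.95, df = 1)
pi0 <- seq(1e-6, 1 - 1e-6, length.out = 100000)

score.stat <- (pi.hat - pi0)^2 / (pi0 * (1 - pi0) / n)
lrt.stat   <- 2 * (ifelse(y == 0, 0, y * log(pi.hat / pi0)) +
                   (n - y) * log((1 - pi.hat) / (1 - pi0)))

range(pi0[score.stat <= crit])   # roughly (0, 0.133)
range(pi0[lrt.stat   <= crit])   # roughly (0, 0.074)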

8 Exact inference

An exact test proceeds as follows. Under $H_0: \pi = \pi_0$ we know $Y \sim \mathrm{bin}(n,\pi_0)$. Values of $\hat\pi$ far away from $\pi_0$, or equivalently values of $Y$ far away from $n\pi_0$, indicate that $H_0: \pi = \pi_0$ is unlikely. Say we reject $H_0$ if $Y < a$ or $Y > b$, where $0 \le a < b \le n$. Then we set the type I error at $\alpha$ by requiring $P(\text{reject } H_0 \mid H_0 \text{ is true}) = \alpha$. That is,
\[ P(Y < a \mid \pi = \pi_0) = \frac{\alpha}{2} \quad\text{and}\quad P(Y > b \mid \pi = \pi_0) = \frac{\alpha}{2}. \]

9 However, since $Y$ is discrete, the best we can do is to bound the type I error by choosing $a$ as large as possible such that
\[ P(Y < a \mid \pi = \pi_0) = \sum_{i=0}^{a-1} \binom{n}{i}\pi_0^i(1-\pi_0)^{n-i} < \frac{\alpha}{2}, \]
and $b$ as small as possible such that
\[ P(Y > b \mid \pi = \pi_0) = \sum_{i=b+1}^{n} \binom{n}{i}\pi_0^i(1-\pi_0)^{n-i} < \frac{\alpha}{2}. \]

10 For example, when $n = 20$, $H_0: \pi = 0.25$, and $\alpha = 0.05$, we have $P(Y < 2 \mid \pi = 0.25) = 0.024$ and $P(Y < 3 \mid \pi = 0.25) = 0.091$, so $a = 2$. Also, $P(Y > 9 \mid \pi = 0.25) = 0.014$ and $P(Y > 8 \mid \pi = 0.25) = 0.041$, so $b = 9$. We reject $H_0: \pi = 0.25$ when $Y < 2$ or $Y > 9$. The type I error is bounded, $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) \le 0.05$, but in fact this is conservative: $P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.024 + 0.014 = 0.038$. Nonetheless, this type of exact test can be inverted to obtain exact confidence intervals for $\pi$. However, the actual coverage probability is at least as large as $1-\alpha$, and typically more, so the procedure errs on the side of being conservative (CIs are bigger than they need to be).
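The cutoffs $a$ and $b$ and the attained size are easy to find from the binomial CDF; a short R sketch for $n = 20$, $\pi_0 = 0.25$, $\alpha = 0.05$:

n <- 20; pi0 <- 0.25; alpha <- 0.05

## P(Y < a) = pbinom(a - 1, n, pi0); take the largest a keeping it below alpha/2
a <- max(which(pbinom(0:n - 1, n, pi0) < alpha / 2)) - 1   # which() is 1-based
## P(Y > b) = pbinom(b, n, pi0, lower.tail = FALSE); take the smallest such b
b <- min(which(pbinom(0:n, n, pi0, lower.tail = FALSE) < alpha / 2)) - 1

size <- pbinom(a - 1, n, pi0) + pbinom(b, n, pi0, lower.tail = FALSE)
c(a = a, b = b, attained.size = size)   # a = 2, b = 9, size approximately 0.038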

11 To obtain the 95% CI from inverting the score test, and from inverting the exact (Clopper-Pearson) test, in R type:

> out1 <- prop.test(x=0, n=25, conf.level=0.95, correct=F)
> out1$conf.int
[1] 0.0000000 0.1331923
attr(,"conf.level")
[1] 0.95
> out2 <- binom.test(x=0, n=25, conf.level=0.95)
> out2$conf.int
[1] 0.0000000 0.1371852
attr(,"conf.level")
[1] 0.95

Comment: I would recommend, wherever possible, the use of exact tests instead of approximate tests.

12 Describing Contingency Tables

Contingency tables & their distributions

Let X and Y be categorical variables measured on a subject, with I and J levels respectively. Each subject sampled will have an associated (X, Y); e.g. (X, Y) = (female, Republican). For the gender variable X, I = 2, and for political affiliation Y we might have J = 3. Say n individuals are sampled and cross-classified according to their outcome (X, Y). A contingency table places the raw number of subjects falling into each cross-classification category into the table cells. We call such a table an I x J table.

13 If we relabel the category outcomes to be integers 1 <= X <= I and 1 <= Y <= J (i.e. turn our experimental outcomes into random variables), we can simplify notation. In the abstract, a contingency table looks like:

n_ij       Y = 1    Y = 2    ...    Y = J    Totals
X = 1      n_11     n_12     ...    n_1J     n_1+
X = 2      n_21     n_22     ...    n_2J     n_2+
...
X = I      n_I1     n_I2     ...    n_IJ     n_I+
Totals     n_+1     n_+2     ...    n_+J     n = n_++

If subjects are randomly sampled from the population and cross-classified, both X and Y are random and (X, Y) has a bivariate discrete joint distribution. Let $\pi_{ij} = P(X = i, Y = j)$, the probability of falling into the (i, j)th (row, column) cell of the table.
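For illustration, here is how such a table, with its $n_{i+}$, $n_{+j}$, and $n_{++}$ margins, might be built in R from raw (X, Y) pairs; the data below are simulated, not real:

## Simulate n subjects cross-classified on two categorical variables
set.seed(1)
n <- 100
X <- sample(c("female", "male"), n, replace = TRUE)
Y <- sample(c("Democrat", "Republican", "Independent"), n, replace = TRUE)

tab <- table(X, Y)   # the I x J table of counts n_ij
addmargins(tab)      # appends the row totals n_i+, column totals n_+j, and n_++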

14 From Chapter 2 in Christensen (1997) we have a sample of n = 52 males aged 11 to 30 years with knee operations via arthroscopic surgery. They are cross-classified according to X = 1, 2, 3 for injury type (twisted knee, direct blow, or both) and Y = 1, 2, 3 for surgical result (excellent, good, or fair-to-poor).

n_ij            Excellent   Good   Fair to poor   Totals
Twisted knee
Direct blow
Both types
Totals                                             n = 52

with theoretical probabilities:

pi_ij           Excellent   Good   Fair to poor   Totals
Twisted knee    pi_11       pi_12  pi_13          pi_1+
Direct blow     pi_21       pi_22  pi_23          pi_2+
Both types      pi_31       pi_32  pi_33          pi_3+
Totals          pi_+1       pi_+2  pi_+3          pi_++ = 1

15 The marginal probabilities that X = i or Y = j are
\[ P(X = i) = \sum_{j=1}^{J} P(X = i, Y = j) = \sum_{j=1}^{J} \pi_{ij} = \pi_{i+}, \]
\[ P(Y = j) = \sum_{i=1}^{I} P(X = i, Y = j) = \sum_{i=1}^{I} \pi_{ij} = \pi_{+j}. \]
A "+" in place of a subscript denotes a sum of all elements over that subscript. We must have
\[ \pi_{++} = \sum_{i=1}^{I}\sum_{j=1}^{J} \pi_{ij} = 1. \]
The counts have a multinomial distribution $\mathbf{n} \sim \mathrm{mult}(n, \boldsymbol\pi)$, where $\mathbf{n} = [n_{ij}]_{I\times J}$ and $\boldsymbol\pi = [\pi_{ij}]_{I\times J}$.
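A brief R sketch, with a made-up 2 x 3 probability table, showing that a table sampled this way is a single multinomial draw whose cell probabilities sum to one:

## Hypothetical 2 x 3 probability table; entries must sum to 1
probs <- matrix(c(0.10, 0.20, 0.10,
                  0.15, 0.25, 0.20), nrow = 2, byrow = TRUE)
stopifnot(abs(sum(probs) - 1) < 1e-12)

## One multinomial draw of n = 200 subjects, reshaped back into an I x J table
n <- 200
counts <- matrix(rmultinom(1, size = n, prob = probs), nrow = nrow(probs))
addmargins(as.table(counts))   # margins give n_i+, n_+j, and n_++ = n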

16 Often X is fixed and only Y is random. For example, in a case-control study a fixed number of cases and a fixed number of controls are sampled. Then a risk factor or exposure Y is compared among cases and controls within the table. This results in a separate multinomial distribution for each level of X; however, we can pretend that the joint distribution is one large multinomial and nothing changes.

Example: Aspirin and heart attacks. n = 1360 stroke patients randomly assigned to aspirin or placebo & followed about 3 years.

                    Heart attack   No heart attack   Total
Placebo (fixed)
Aspirin (fixed)

17 Independence

When (X, Y) are jointly distributed, X and Y are independent if
\[ P(X = i, Y = j) = P(X = i)P(Y = j), \quad\text{or}\quad \pi_{ij} = \pi_{i+}\pi_{+j}. \]
Let
\[ \pi^{X\mid Y}_{i\mid j} = P(X = i \mid Y = j) = \pi_{ij}/\pi_{+j} \quad\text{and}\quad \pi^{Y\mid X}_{j\mid i} = P(Y = j \mid X = i) = \pi_{ij}/\pi_{i+}. \]
Then independence of X and Y implies
\[ P(X = i \mid Y = j) = P(X = i) \quad\text{and}\quad P(Y = j \mid X = i) = P(Y = j). \]
The probability of any given column response is the same for each row, and the probability of any given row response is the same for each column.

18 Comparing two proportions

Let X and Y be dichotomous. Let $\pi_1 = P(Y = 1 \mid X = 1)$ and $\pi_2 = P(Y = 1 \mid X = 2)$. The difference in the probability of Y = 1 when X = 1 versus X = 2 is $\pi_1 - \pi_2$. The relative risk $\pi_1/\pi_2$ may be more informative for rare outcomes. However, it may also exaggerate the effect of X = 1 versus X = 2 and cloud the issue.

19 Example: Let Y = 1 indicate presence of a disease and X = 1 indicate an exposure.

When $\pi_2 = 0.001$ and $\pi_1 = 0.01$, the difference is $\pi_1 - \pi_2 = 0.009$. However, $\pi_1/\pi_2 = 10$: you are 10 times more likely to get the disease when X = 1 than when X = 2. In either case, though, the probability of getting the disease is small.

When $\pi_2 = 0.401$ and $\pi_1 = 0.41$, the difference is again $\pi_1 - \pi_2 = 0.009$. However, $\pi_1/\pi_2 \approx 1.02$: you are about 2% more likely to get the disease when X = 1 than when X = 2. This doesn't seem as drastic as 1000%.

These sorts of comparisons figure into reporting results concerning public health and safety information, e.g. HRT for post-menopausal women, relative safety of SUVs versus sedans, etc.
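A quick R check of the two scenarios just described:

## Two pairs of probabilities with the same difference but very different relative risks
risk.summary <- function(pi1, pi2)
  c(difference = pi1 - pi2, relative.risk = pi1 / pi2)

risk.summary(pi1 = 0.01, pi2 = 0.001)   # difference 0.009, relative risk 10
risk.summary(pi1 = 0.41, pi2 = 0.401)   # difference 0.009, relative risk about 1.02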

20 Odds ratios

The odds of success (say Y = 1) versus failure (Y = 2) are $\Omega = \pi/(1-\pi)$, where $\pi = P(Y = 1)$. When someone says "3 to 1 odds the Vikings will win," they mean $\Omega = 3$, which implies the probability the Vikings will win is 0.75, from $\pi = \Omega/(\Omega+1)$. Odds measure the relative rates of success and failure.

An odds ratio compares the relative rates of success (or disease or whatever) across two exposures X = 1 and X = 2:
\[ \theta = \frac{\Omega_1}{\Omega_2} = \frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)}. \]
Odds ratios are always positive, and a ratio greater than 1 indicates that the relative rate of success for X = 1 is greater than for X = 2. However, the odds ratio gives no information on the individual probabilities $\pi_1 = P(Y = 1 \mid X = 1)$ and $\pi_2 = P(Y = 1 \mid X = 2)$.

21 Different values for these parameters can lead to the same odds ratio.

Example: $\pi_1 = 5/6 \approx 0.83$ & $\pi_2 = 0.5$ yield $\theta = 5.0$. So does a pair of much smaller probabilities, e.g. $\pi_1 = 0.05$ & $\pi_2 \approx 0.0104$. One set of values might imply a different decision than the other, but $\theta = 5.0$ in both cases. Here, the relative risk is about 1.7 and 5, respectively.

Note that when dealing with a rare outcome, where $\pi_i \approx 0$, the relative risk is approximately equal to the odds ratio.

22 When $\theta = 1$ we must have $\Omega_1 = \Omega_2$, which further implies that $\pi_1 = \pi_2$, and hence Y does not depend on the value of X. If (X, Y) are both random, then X and Y are stochastically independent.

An important property of the odds ratio is the following:
\[ \theta = \frac{P(Y = 1 \mid X = 1)/P(Y = 2 \mid X = 1)}{P(Y = 1 \mid X = 2)/P(Y = 2 \mid X = 2)} = \frac{P(X = 1 \mid Y = 1)/P(X = 2 \mid Y = 1)}{P(X = 1 \mid Y = 2)/P(X = 2 \mid Y = 2)}. \]
You should verify this formally. This implies that, for the purposes of estimating an odds ratio, it does not matter whether data are sampled prospectively, retrospectively, or cross-sectionally. The common odds ratio is estimated by $\hat\theta = n_{11}n_{22}/[n_{12}n_{21}]$.
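A sketch of the verification: write both conditional forms in terms of the joint probabilities $\pi_{ij}$ and cancel the marginals.
\[ \frac{P(Y=1\mid X=1)/P(Y=2\mid X=1)}{P(Y=1\mid X=2)/P(Y=2\mid X=2)} = \frac{(\pi_{11}/\pi_{1+})/(\pi_{12}/\pi_{1+})}{(\pi_{21}/\pi_{2+})/(\pi_{22}/\pi_{2+})} = \frac{\pi_{11}\pi_{22}}{\pi_{12}\pi_{21}}, \]
\[ \frac{P(X=1\mid Y=1)/P(X=2\mid Y=1)}{P(X=1\mid Y=2)/P(X=2\mid Y=2)} = \frac{(\pi_{11}/\pi_{+1})/(\pi_{21}/\pi_{+1})}{(\pi_{12}/\pi_{+2})/(\pi_{22}/\pi_{+2})} = \frac{\pi_{11}\pi_{22}}{\pi_{12}\pi_{21}}. \]
Both reduce to the cross-product ratio $\pi_{11}\pi_{22}/(\pi_{12}\pi_{21})$, so the row-conditional and column-conditional odds ratios coincide.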

23 Inference for Contingency Tables

Inference for association parameters: odds ratios

The sample odds ratio $\hat\theta = n_{11}n_{22}/(n_{12}n_{21})$ can be zero, undefined, or $\infty$ if one or more of $\{n_{11}, n_{22}, n_{12}, n_{21}\}$ are zero. An alternative is to add 1/2 an observation to each cell:
\[ \tilde\theta = \frac{(n_{11}+0.5)(n_{22}+0.5)}{(n_{12}+0.5)(n_{21}+0.5)}. \]
This also corresponds to a particular Bayesian estimate.

Both $\hat\theta$ and $\tilde\theta$ have skewed sampling distributions when $n = n_{++}$ is small. The sampling distribution of $\log\hat\theta$ is relatively symmetric and therefore more amenable to a Gaussian approximation. An approximate $(1-\alpha)\times 100\%$ CI for $\log\theta$ is given by
\[ \log\hat\theta \pm z_{\alpha/2}\sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}. \]
A CI for $\theta$ is obtained by exponentiating the interval endpoints.

24 When $\hat\theta = 0$ this doesn't work ($\log 0 = -\infty$). One can use $n_{ij} + 0.5$ in place of $n_{ij}$ in the MLE and standard error, yielding
\[ \log\tilde\theta \pm z_{\alpha/2}\sqrt{\frac{1}{n_{11}+0.5} + \frac{1}{n_{12}+0.5} + \frac{1}{n_{21}+0.5} + \frac{1}{n_{22}+0.5}}. \]
Perhaps a better approach would involve inverting score or LRT tests for $H_0: \theta = t$.
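A small R sketch of this adjusted interval; the 2 x 2 counts below are hypothetical:

## Wald CI for the odds ratio with the +0.5 adjustment
or.ci <- function(tab, conf.level = 0.95) {
  a <- tab + 0.5                                    # add 1/2 observation to each cell
  log.or <- log(a[1, 1] * a[2, 2] / (a[1, 2] * a[2, 1]))
  se     <- sqrt(sum(1 / a))                        # sum of reciprocal adjusted counts
  z      <- qnorm(1 - (1 - conf.level) / 2)
  exp(log.or + c(-1, 1) * z * se)                   # CI for theta itself
}

or.ci(matrix(c(10, 2, 4, 12), nrow = 2, byrow = TRUE))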

25 Example: Aspirin and heart attacks. n = 1360 stroke patients randomly assigned to aspirin or placebo (product multinomial sampling) & followed about 3 years.

                    Heart attack   No heart attack   Total
Placebo (fixed)
Aspirin (fixed)

The 95% CI for $\log\theta$ based on $\hat\theta$ is $(-0.157, 1.047)$, and so the CI for $\theta$ is $(e^{-0.157}, e^{1.047}) = (0.85, 2.85)$. We cannot reject $H_0: \theta = 1$ (at level $\alpha = 0.05$): the data are consistent with heart attack incidence being unrelated to aspirin intake.

26 Difference in proportions

Assume (1) multinomial sampling or (2) product binomial sampling where the $n_{i+}$ are fixed (fixed row totals, as in the heart attack data). Let $\pi_1 = P(Y = 1 \mid X = 1)$ and $\pi_2 = P(Y = 1 \mid X = 2)$. The sample proportion for each level of X is the MLE:
\[ \hat\pi_1 = n_{11}/n_{1+}, \qquad \hat\pi_2 = n_{21}/n_{2+}. \]
Using either large-sample results or the CLT we have
\[ \hat\pi_1 \;\dot\sim\; N\!\left(\pi_1,\ \frac{\pi_1(1-\pi_1)}{n_{1+}}\right), \qquad \hat\pi_2 \;\dot\sim\; N\!\left(\pi_2,\ \frac{\pi_2(1-\pi_2)}{n_{2+}}\right). \]
Since the difference of two independent normals is also normal, we have
\[ \hat\pi_1 - \hat\pi_2 \;\dot\sim\; N\!\left(\pi_1 - \pi_2,\ \frac{\pi_1(1-\pi_1)}{n_{1+}} + \frac{\pi_2(1-\pi_2)}{n_{2+}}\right). \]

27 Plugging in the MLEs for the unknowns, we estimate the standard deviation of the difference in sample proportions by the standard error
\[ \hat\sigma(\hat\pi_1 - \hat\pi_2) = \sqrt{\frac{\hat\pi_1(1-\hat\pi_1)}{n_{1+}} + \frac{\hat\pi_2(1-\hat\pi_2)}{n_{2+}}}. \]
A Wald CI for the unknown difference has endpoints
\[ \hat\pi_1 - \hat\pi_2 \pm z_{\alpha/2}\,\hat\sigma(\hat\pi_1 - \hat\pi_2). \]
For the aspirin data, this yields $0.014 \pm 1.96(0.0097)$, for the 95% CI $(-0.005, 0.033)$.
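In R this interval can be computed directly from a 2 x 2 table with fixed row totals; the counts below are hypothetical stand-ins, not the aspirin data:

## Wald CI for pi1 - pi2 from a 2 x 2 table with fixed row totals
diff.ci <- function(tab, conf.level = 0.95) {
  p1 <- tab[1, 1] / sum(tab[1, ])                  # row-1 sample proportion
  p2 <- tab[2, 1] / sum(tab[2, ])                  # row-2 sample proportion
  se <- sqrt(p1 * (1 - p1) / sum(tab[1, ]) + p2 * (1 - p2) / sum(tab[2, ]))
  z  <- qnorm(1 - (1 - conf.level) / 2)
  c(estimate = p1 - p2, lower = p1 - p2 - z * se, upper = p1 - p2 + z * se)
}

diff.ci(matrix(c(15, 285, 10, 290), nrow = 2, byrow = TRUE))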

28 Estimating relative risk

Like the odds ratio, the relative risk $\pi_1/\pi_2 \in (0, \infty)$ and tends to have a skewed sampling distribution in small samples. Let $r = \hat\pi_1/\hat\pi_2$ be the sample relative risk. Large-sample normality implies
\[ \log r = \log(\hat\pi_1/\hat\pi_2) \;\dot\sim\; N\!\left(\log(\pi_1/\pi_2),\ \sigma^2(\log r)\right), \]
where
\[ \sigma(\log r) = \sqrt{\frac{1-\pi_1}{\pi_1 n_{1+}} + \frac{1-\pi_2}{\pi_2 n_{2+}}}. \]
Plugging in $\hat\pi_i$ for $\pi_i$ gives the standard error, and CIs are obtained as usual for $\log(\pi_1/\pi_2)$, then exponentiated to get the CI for $\pi_1/\pi_2$.

Applying this to the heart attack data we obtain a 95% CI for $\pi_1/\pi_2$ of $(0.86, 2.75)$: the probability of a heart attack on placebo is estimated to be between 0.86 and 2.75 times that on aspirin.
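And a matching R sketch for the relative-risk interval, using the same hypothetical 2 x 2 layout as in the sketch above:

## Wald CI for the relative risk pi1/pi2 on the log scale
rr.ci <- function(tab, conf.level = 0.95) {
  n1 <- sum(tab[1, ]); n2 <- sum(tab[2, ])
  p1 <- tab[1, 1] / n1; p2 <- tab[2, 1] / n2
  log.rr <- log(p1 / p2)
  se     <- sqrt((1 - p1) / (p1 * n1) + (1 - p2) / (p2 * n2))
  z      <- qnorm(1 - (1 - conf.level) / 2)
  exp(log.rr + c(-1, 1) * z * se)                  # back-transform to the RR scale
}

rr.ci(matrix(c(15, 285, 10, 290), nrow = 2, byrow = TRUE))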

29 Testing independence in I x J tables

Assume one $\mathrm{mult}(n, \boldsymbol\pi)$ distribution for the whole table, and let $\pi_{ij} = P(X = i, Y = j)$; we must have $\pi_{++} = 1$. If the table is 2 x 2, we can just look at $H_0: \theta = 1$. In general, independence holds if
\[ H_0: \pi_{ij} = \pi_{i+}\pi_{+j} \ \text{ for all } i, j, \quad\text{or equivalently,}\quad \mu_{ij} = n\pi_{i+}\pi_{+j}, \]
where $\mu_{ij} = E(n_{ij}) = n\pi_{ij}$ is the expected cell count. That is, independence imposes a constraint: under it, the marginal parameters $\pi_{1+},\ldots,\pi_{I+}$ and $\pi_{+1},\ldots,\pi_{+J}$ define all probabilities in the I x J table.

30 Pearson's statistic is
\[ X^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} \frac{(n_{ij} - \hat\mu_{ij})^2}{\hat\mu_{ij}}, \]
where $\hat\mu_{ij} = n(n_{i+}/n)(n_{+j}/n) = n_{i+}n_{+j}/n$ is the MLE of $\mu_{ij}$ under $H_0$.

There are $I-1$ free $\{\pi_{i+}\}$ and $J-1$ free $\{\pi_{+j}\}$, so the degrees of freedom are
\[ IJ - 1 - [(I-1) + (J-1)] = (I-1)(J-1). \]
When $H_0$ is true, $X^2 \;\dot\sim\; \chi^2_{(I-1)(J-1)}$.

31 The LRT statistic boils down to
\[ G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} n_{ij}\log(n_{ij}/\hat\mu_{ij}), \]
and also $G^2 \;\dot\sim\; \chi^2_{(I-1)(J-1)}$ when $H_0$ is true. Moreover, $X^2 - G^2 \xrightarrow{p} 0$.

The $\chi^2$ approximation is better for $X^2$ than for $G^2$ in smaller samples. The approximation can be okay when some $\hat\mu_{ij} = n_{i+}n_{+j}/n$ are as small as 1, provided most are at least 5. When in doubt, use small-sample methods. Everything holds for product multinomial sampling too (fixed marginal totals for one variable)!
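For a generic I x J table of counts, both statistics take only a few lines in R; the example table below is made up:

## Pearson X^2 and likelihood-ratio G^2 tests of independence
tab <- matrix(c(20, 30, 25,
                15, 40, 45), nrow = 2, byrow = TRUE)

mu.hat <- outer(rowSums(tab), colSums(tab)) / sum(tab)   # fitted counts under H0
X2 <- sum((tab - mu.hat)^2 / mu.hat)
G2 <- 2 * sum(tab * log(tab / mu.hat))                   # assumes all n_ij > 0
df <- (nrow(tab) - 1) * (ncol(tab) - 1)

c(X2 = X2, G2 = G2, df = df,
  p.X2 = pchisq(X2, df, lower.tail = FALSE),
  p.G2 = pchisq(G2, df, lower.tail = FALSE))

## chisq.test(tab, correct = FALSE) reproduces X2 and its p-value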

32 Following up chi-squared tests for independence

Rejecting $H_0: \pi_{ij} = \pi_{i+}\pi_{+j}$ does not tell us about the nature of the association.

Pearson and standardized residuals

The Pearson residual is
\[ e_{ij} = \frac{n_{ij} - \hat\mu_{ij}}{\sqrt{\hat\mu_{ij}}}, \]
where, as before, $\hat\mu_{ij} = n_{i+}n_{+j}/n$ is the estimate under $H_0: X \perp Y$. When $H_0: X \perp Y$ is true, under multinomial sampling $e_{ij} \;\dot\sim\; N(0, v)$, where $v < 1$, in large samples. Note that $\sum_{i=1}^{I}\sum_{j=1}^{J} e_{ij}^2 = X^2$.
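In R, both kinds of residuals are available from chisq.test; a minimal sketch on a made-up table ($r_{ij}$, the standardized version, is defined on the next slide):

## Pearson and standardized Pearson residuals from a test of independence
tab <- matrix(c(20, 30, 25,
                15, 40, 45), nrow = 2, byrow = TRUE)

fit <- chisq.test(tab)
fit$residuals   # Pearson residuals e_ij = (n_ij - mu.hat_ij) / sqrt(mu.hat_ij)
fit$stdres      # standardized residuals r_ij, approximately N(0,1) under H0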

33 Standardized Pearson residuals are Pearson residuals divided by their standard error under multinomial sampling:
\[ r_{ij} = \frac{n_{ij} - \hat\mu_{ij}}{\sqrt{\hat\mu_{ij}(1 - p_{i+})(1 - p_{+j})}}, \]
where the $p_{i+} = n_{i+}/n$ and $p_{+j} = n_{+j}/n$ (i.e. $p_{ij} = n_{ij}/n$) are MLEs under the full (non-independence) model. Values of $|r_{ij}| > 3$ happen very rarely when $H_0: X \perp Y$ is true, and $|r_{ij}| > 2$ happen only roughly 5% of the time. Pearson residuals and their standardized version tell us which cell counts are much larger or smaller than what we would expect under $H_0: X \perp Y$.

Example: we analyze data on religious beliefs and highest degree achieved (Agresti, 2002) using the following SAS code:

34 Religious beliefs by highest degree

Highest degree                    Fundamentalist   Moderate   Liberal
Less than high school
High school or junior college
Bachelor or graduate

data table;
input Degree $ Religion $ count;
datalines;
1 fund
1 mod
1 lib
2 fund
2 mod
2 lib
3 fund
3 mod
3 lib  252
;
proc format;
value $dc '1' = '< HS'  '2' = 'HS or JC'  '3' = '>= BA/BS';
value $rc 'fund' = 'Fundamentalist'  'mod' = 'Moderate'  'lib' = 'Liberal';
proc freq order=data;
weight count;
format Religion $rc. Degree $dc.;
tables Degree*Religion / chisq expected measures cmh1;
proc genmod order=data;
class Degree Religion;
format Religion $rc. Degree $dc.;
model count = Degree Religion / dist=poi link=log residuals;
run;

35 Annotated output from proc freq: the Degree by Religion cross-classification, with each cell showing the observed Frequency, the Expected count under independence, the overall Percent, the Row Pct, and the Col Pct, along with the row and column Totals.

36 More output.

Statistics for Table of Degree by Religion

Statistic                          DF    Value    Prob
Chi-Square                          4             <.0001
Likelihood Ratio Chi-Square         4             <.0001

Annotated output from proc genmod: the Observation Statistics table lists, for each of the nine cells, the raw residual (Resraw), the Pearson residual (Reschi), the deviance residual (Resdev), their standardized versions (StResdev and StReschi), and the likelihood residual (Reslik).

37 The StReschi column has the $r_{ij}$. Values larger than 3 in magnitude indicate severe departures from independence. Observations 1 and 9, corresponding to (less than high school, Fundamentalist) and (at least BA/BS, Liberal), are over-represented relative to independence. Observations 6 and 7, corresponding to (HS or JC, Liberal) and (at least BA/BS, Fundamentalist), are under-represented. That is, we tend to see concentrations along the diagonal: increased education is associated with increasingly liberal religious views.
