Review of One-way Tables and SAS


Stat 504, Lecture 7

In-class exercises: Ex1, Ex2, and Ex3 from [link missing].

To calculate the p-value for an $X^2$ or $G^2$ statistic in SAS, use the PROBCHI function, which returns the chi-square cumulative distribution function. For example, if $X^2 = 0.47$ with df = 2, then the p-value is 1-probchi(0.47,2).
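As a cross-check outside SAS, the same upper-tail probability can be sketched in Python with only the standard library. The helper name `chi2_sf` is an assumption made for illustration (it is not a SAS or course function). It relies on the closed-form finite series that exists when the degrees of freedom are a positive integer.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square_df > x) for a positive integer df.

    Hypothetical helper, equivalent in intent to 1 - probchi(x, df) in SAS.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    term = math.sqrt(x)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * normal_pdf * total

# The example from the text: X^2 = 0.47 with df = 2.
p_value = chi2_sf(0.47, 2)
print(p_value)  # about 0.79
```

With df = 2 the series collapses to `exp(-x/2)`, so the p-value here is simply `exp(-0.235)`.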

Introduction to Two-Way Tables

Example 1: a $2 \times 2$ table of counts and/or proportions.

Table 1: Incidence of common colds among French skiers (Pauling (1971), as reported in Fienberg (1980)). [cell values missing]

                   Cold    No Cold    Totals
    Placebo
    Ascorbic Acid
    Totals

Table 2: Incidence of common colds among French skiers (Pauling (1971), as reported in Fienberg (1980)). [cell values missing]

                   Cold    No Cold    Totals
    Placebo
    Ascorbic Acid
    Totals

Q1: Compare the relative frequency of occurrence of some characteristic in two groups; e.g., is the probability that a member of the placebo group contracts a cold the same as the probability that a member of the ascorbic acid group contracts a cold?

Q2: Are two characteristics independent; e.g., are the type of treatment and contracting a cold associated?

Q3: Is one characteristic a cause of another; e.g., does ascorbic acid (vitamin C) have therapeutic value in preventing a cold?

Suppose that we collect data on two binary variables, $Y$ and $Z$. Binary means that each variable takes two possible values, say 1 (e.g., cold) and 2 (e.g., no cold). Suppose we collect values of $Y$ (e.g., treatment) and $Z$ (e.g., contracting a cold) for $n$ sample units. The data then consist of $n$ pairs, $(y_1, z_1), (y_2, z_2), \ldots, (y_n, z_n)$.

We can summarize the data in a frequency table. Let $x_{ij}$ be the number of sample units having $Y = i$ and $Z = j$. Then $x = (x_{11}, x_{12}, x_{21}, x_{22})$ is a summary of all $n$ responses; e.g., $x_{11} = 31$. We could display $x$ as a one-way table with four cells, but it is customary to display $x$ as a square table with two rows and two columns:

             Z = 1       Z = 2
    Y = 1    $x_{11}$    $x_{12}$
    Y = 2    $x_{21}$    $x_{22}$

Marginal totals. When a subscript in a cell count $x_{ij}$ is replaced by a plus sign (+), it means that we have summed the cell counts over that subscript. The row totals are

$x_{1+} = x_{11} + x_{12}$, $x_{2+} = x_{21} + x_{22}$,

the column totals are

$x_{+1} = x_{11} + x_{21}$, $x_{+2} = x_{12} + x_{22}$,

and the grand total is

$x_{++} = x_{11} + x_{12} + x_{21} + x_{22} = n$.

These quantities are often called marginal totals, because they are conveniently placed in the margins of the table, like this:

             Z = 1       Z = 2       total
    Y = 1    $x_{11}$    $x_{12}$    $x_{1+}$
    Y = 2    $x_{21}$    $x_{22}$    $x_{2+}$
    total    $x_{+1}$    $x_{+2}$    $x_{++}$
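The plus-sign notation maps directly onto row and column sums. A minimal Python sketch, using the overweight/hypertension counts (30, 60, 60, 150) from the worked example later in this lecture; the variable names are illustrative assumptions:

```python
# Observed 2x2 counts: rows are Y = 1, 2 and columns are Z = 1, 2.
x = [[30, 60],
     [60, 150]]

row_totals = [sum(row) for row in x]         # x_{1+}, x_{2+}
col_totals = [sum(col) for col in zip(*x)]   # x_{+1}, x_{+2}
n = sum(row_totals)                          # x_{++}

print(row_totals, col_totals, n)  # [90, 210] [90, 210] 300
```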

If the sample units are randomly sampled from a large population, then $x = (x_{11}, x_{12}, x_{21}, x_{22})$ will have a multinomial distribution with index $n = x_{++}$ and parameter vector $\pi = (\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22})$, where $\pi_{ij} = P(Y = i, Z = j)$.

             Z = 1         Z = 2         total
    Y = 1    $\pi_{11}$    $\pi_{12}$    $\pi_{1+}$
    Y = 2    $\pi_{21}$    $\pi_{22}$    $\pi_{2+}$
    total    $\pi_{+1}$    $\pi_{+2}$    $\pi_{++} = 1$

The probability distribution $\{\pi_{ij}\}$ is the joint distribution of $Y$ and $Z$. When you sum the joint probabilities over one subscript, you get a marginal distribution; e.g., the probability distribution $\{\pi_{i+}\}$ is the marginal distribution of $Y$, where $P(Y = 1) = \pi_{1+}$ and $P(Y = 2) = \pi_{2+}$.

How does the distribution of $Z$ change as the category of $Y$ changes? The conditional distribution of $Z$ given $Y = i$ is

$\pi_{j|i} = \pi_{ij} / \pi_{i+}$, such that $\sum_j \pi_{j|i} = 1$.
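To make the joint, marginal and conditional distributions concrete, here is a small Python sketch. The joint probabilities are hypothetical values chosen only for illustration; they do not come from the lecture.

```python
# Hypothetical joint distribution pi[i][j] = P(Y = i+1, Z = j+1); sums to 1.
pi = [[0.10, 0.20],
      [0.30, 0.40]]

# Marginal distribution of Y: pi_{i+} = sum over j of pi_{ij}.
marginal_y = [sum(row) for row in pi]

# Conditional distribution of Z given Y = i: pi_{j|i} = pi_{ij} / pi_{i+}.
conditional = [[pij / marginal_y[i] for pij in row]
               for i, row in enumerate(pi)]

for i, row in enumerate(conditional, start=1):
    print(f"P(Z = j | Y = {i}):", [round(p, 3) for p in row])
```

Each conditional row sums to 1, which is the constraint $\sum_j \pi_{j|i} = 1$ stated above.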

In-class exercise: What is the observed conditional probability distribution $P(\text{cold} \mid \text{treatment})$?

Under a general multinomial model, the vector $\pi$ contains three unknown parameters: there are four cell probabilities, but they are constrained to sum to 1. The general multinomial model is often called the saturated model, because it contains the maximum number of unknown parameters.

Explore the geometry of $2 \times 2$ tables: 2.cs.cmu.edu/ eairoldi/tetrahedron3d/

The independence model

Given a $2 \times 2$ table, it is natural to ask how $Y$ and $Z$ are related. Suppose for the moment that there is no relationship between $Y$ and $Z$, i.e., that they are independent. Independence means that

$\pi_{ij} = P(Y = i, Z = j) = P(Y = i)\,P(Z = j)$ for $i, j = 1, 2$.

Let $P(Y = 1) = \alpha$ and $P(Z = 1) = \beta$, so that $P(Y = 2) = 1 - \alpha$ and $P(Z = 2) = 1 - \beta$. Under independence, we have

$\pi_{11} = P(Y = 1)\,P(Z = 1) = \alpha\beta$, (1)
$\pi_{12} = P(Y = 1)\,P(Z = 2) = \alpha(1 - \beta)$, (2)
$\pi_{21} = P(Y = 2)\,P(Z = 1) = (1 - \alpha)\beta$, (3)
$\pi_{22} = P(Y = 2)\,P(Z = 2) = (1 - \alpha)(1 - \beta)$. (4)

Note that

$\alpha = \pi_{1+} = \pi_{11} + \pi_{12}$,
$1 - \alpha = \pi_{2+} = \pi_{21} + \pi_{22}$,
$\beta = \pi_{+1} = \pi_{11} + \pi_{21}$,
$1 - \beta = \pi_{+2} = \pi_{12} + \pi_{22}$,

so the condition of independence can be conveniently written as

$\pi_{ij} = \pi_{i+}\,\pi_{+j}$, $i, j = 1, 2$. (5)

The primary reason that we introduced the symbols $\alpha$ and $\beta$ for $\pi_{1+}$ and $\pi_{+1}$ is to emphasize that under the independence model, there are only two unknown parameters. Once $\alpha$ and $\beta$ are known, the vector $\pi$ can be found using (1)-(4). The independence model is a submodel (i.e., a special case) of the saturated model that satisfies the constraints (5).
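Equations (1)-(4) and the constraint (5) can be checked numerically. In the sketch below, the function name `independence_probs` and the values of $\alpha$ and $\beta$ are hypothetical, chosen only to illustrate the construction.

```python
def independence_probs(alpha, beta):
    """Cell probabilities under independence, following equations (1)-(4)."""
    return [[alpha * beta,       alpha * (1 - beta)],
            [(1 - alpha) * beta, (1 - alpha) * (1 - beta)]]

alpha, beta = 0.3, 0.4          # hypothetical P(Y = 1) and P(Z = 1)
pi = independence_probs(alpha, beta)

# Recover the margins and confirm constraint (5): pi_ij = pi_{i+} * pi_{+j}.
row_margins = [sum(row) for row in pi]
col_margins = [sum(col) for col in zip(*pi)]
satisfies_5 = all(
    abs(pi[i][j] - row_margins[i] * col_margins[j]) < 1e-12
    for i in range(2) for j in range(2)
)
print(pi, satisfies_5)
```

Only two numbers go in, and all four cell probabilities come out, which is the sense in which the independence model has two unknown parameters.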

Test of independence

The hypothesis of independence can be tested using the general method described in Lecture 4. To test

$H_0$: the independence model is true

versus

$H_1$: the saturated model is true,

do the following.

First, estimate $\alpha$ and $\beta$, the unknown parameters of the independence model.
Second, calculate estimated cell probabilities and expected frequencies from the estimated $\alpha$ and $\beta$.
Third, calculate $X^2$ and/or $G^2$ and compare them to the appropriate chi-square distribution.

How can we estimate $\alpha$ and $\beta$? Under $H_0$, $Y$ (e.g., treatment) and $Z$ (e.g., cold) provide no information about one another, so we can estimate the parameters of their distributions separately. Note that

$x_{1+} \sim \text{Bin}(n, \alpha)$ (6)

and

$x_{+1} \sim \text{Bin}(n, \beta)$, (7)

and under $H_0$ the counts in (6) and (7) are independent.

Therefore, the ML estimates of $\alpha$ and $\beta$ are

$\hat\alpha = x_{1+}/n$ and $\hat\beta = x_{+1}/n$.

Plugging these estimates into (1)-(4) gives the estimated probabilities

$\hat\pi_{11} = \frac{x_{1+}}{n}\cdot\frac{x_{+1}}{n}$, $\hat\pi_{12} = \frac{x_{1+}}{n}\cdot\frac{x_{+2}}{n}$,
$\hat\pi_{21} = \frac{x_{2+}}{n}\cdot\frac{x_{+1}}{n}$, $\hat\pi_{22} = \frac{x_{2+}}{n}\cdot\frac{x_{+2}}{n}$,

and the estimated expected cell counts

$E_{11} = n\hat\pi_{11} = \frac{x_{1+}x_{+1}}{n}$, $E_{12} = n\hat\pi_{12} = \frac{x_{1+}x_{+2}}{n}$,
$E_{21} = n\hat\pi_{21} = \frac{x_{2+}x_{+1}}{n}$, $E_{22} = n\hat\pi_{22} = \frac{x_{2+}x_{+2}}{n}$.

These four formulas are conveniently summarized as

$E_{ij} = \frac{x_{i+}\,x_{+j}}{n}$, $i, j = 1, 2$,

which can be easily remembered as

expected frequency = (row total $\times$ column total) / grand total.
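The rule "row total times column total over grand total" is short enough to sketch directly in Python. The function name `expected_counts` is an illustrative assumption; the counts are those of the overweight/hypertension example later in this lecture.

```python
def expected_counts(x):
    """Expected counts under independence: E_ij = x_{i+} * x_{+j} / n."""
    row_totals = [sum(row) for row in x]
    col_totals = [sum(col) for col in zip(*x)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[30, 60],
            [60, 150]]
print(expected_counts(observed))  # [[27.0, 63.0], [63.0, 147.0]]
```

The same function works unchanged for a general I x J table, since it never assumes two rows or two columns.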

Under $H_0$, both $X^2$ and $G^2$ are approximately $\chi^2$ distributed, provided that the expected counts $E_{ij}$ are sufficiently large. Under $H_0$ the model has 2 unknown parameters, whereas under $H_1$ there are 3 unknowns. The degrees of freedom are therefore $\nu = 3 - 2 = 1$.

A large value of $X^2$ or $G^2$ indicates that the independence model is not plausible, and thus that $Y$ and $Z$ are related. The 95th percentile of $\chi^2_1$ is 3.84, so an observed value of $X^2$ or $G^2$ greater than 3.84 (roughly 4) means that we can reject the null hypothesis of independence at the .05 level.
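The critical value for one degree of freedom can be verified without tables. The sketch below assumes the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds because a $\chi^2_1$ variable is the square of a standard normal, and finds the 95th percentile by bisection. The function name is a hypothetical choice.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Bisection for the point where the upper-tail probability equals 0.05.
lo, hi = 0.0, 20.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if chi2_1_sf(mid) > 0.05:
        lo = mid      # tail still too heavy: move right
    else:
        hi = mid      # tail at or below 0.05: move left
critical = (lo + hi) / 2.0

print(round(critical, 2))  # 3.84
```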

The test for independence in a $2 \times 2$ table is a special case of the general goodness-of-fit test discussed in Lectures 5 and 6. Therefore, all of the caveats regarding goodness-of-fit tests discussed there apply to this test also.

For the chi-square approximation to work well, the $E_{ij}$'s need to be sufficiently large.
The iid assumption for the $n$ sample units must be satisfied; there should be no clustering in the data.

Example. Suppose that in a sample of $n = 300$ hospital patients, 90 are overweight, 90 are hypertensive, and 30 are both overweight and hypertensive. Is there evidence of a relationship between these two conditions? The observed data are shown below.

                      hypertensive    not hypertensive    total
    overweight             30                60             90
    not overweight         60               150            210
    total                  90               210            300

The expected cell counts for the four cells are

$E_{11} = \frac{90 \times 90}{300} = 27$, $E_{12} = \frac{90 \times 210}{300} = 63$,
$E_{21} = \frac{210 \times 90}{300} = 63$, $E_{22} = \frac{210 \times 210}{300} = 147$.

The goodness-of-fit statistics are

$X^2 = \frac{(30 - 27)^2}{27} + \frac{(60 - 63)^2}{63} + \frac{(60 - 63)^2}{63} + \frac{(150 - 147)^2}{147} = 0.68$,

$G^2 = 2\left[30 \log\frac{30}{27} + 60 \log\frac{60}{63} + 60 \log\frac{60}{63} + 150 \log\frac{150}{147}\right] = 0.67$.

These do not exceed 4, so we cannot reject the independence model at the .05 level. An approximate p-value is $P(\chi^2_1 \ge 0.68) \approx .41$. On the basis of these data, there is little evidence of a relationship between the two conditions.
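The arithmetic of this example can be reproduced with a short Python sketch using only the standard library. The variable names are illustrative, and the p-value uses the one-degree-of-freedom identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$ rather than a statistics package.

```python
import math

observed = [[30, 60],
            [60, 150]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pair each observed count with its expected count.
cells = [(o, e) for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row)]

x2 = sum((o - e) ** 2 / e for o, e in cells)          # Pearson X^2
g2 = 2.0 * sum(o * math.log(o / e) for o, e in cells)  # deviance G^2
p_x2 = math.erfc(math.sqrt(x2 / 2.0))                  # upper tail, df = 1

print(round(x2, 2), round(g2, 2), round(p_x2, 2))  # 0.68 0.67 0.41
```

The deviance formula as written assumes every observed count is positive; a zero cell would need the convention $0 \log 0 = 0$.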

The test for independence in a $2 \times 2$ table can be done in Minitab using the chisq command:

    MTB > read c1-c2
    DATA> 30 60
    DATA> 60 150
    DATA> end
    2 rows read.
    MTB > chisq c1-c2

    Expected counts are printed below observed counts
    [table of observed and expected counts for C1, C2 and Total; values missing]
    ChiSq = ... = ...
    df = 1

Note that Minitab gives only Pearson's $X^2$. Calculating the deviance $G^2$ in Minitab is a little more tedious. One way to do it is to enter the cell counts in a single column, say, C1. Then enter the row sums and column sums in C2 and C3, respectively. Then calculate the expected cell counts and put them into C4.

    MTB > set c1                             # enter observed counts
    DATA> 30 60 60 150
    DATA> end
    MTB > set c2                             # enter row sums
    DATA> 90 90 210 210
    DATA> end
    MTB > set c3                             # enter column sums
    DATA> 90 210 90 210
    DATA> end
    MTB > let c4 = c2*c3/300                 # calculate expected counts
    MTB > let k1 = 2*sum(c1*log(c1/c4))      # calculate G^2
    MTB > print k1
    K1 ...

In R or S-PLUS the Pearson $X^2$ test is easily carried out using the chisq.test() function. By default, this function employs the continuity correction proposed by Yates (1934) for a $2 \times 2$ table. This correction is not universally regarded as appropriate, however, so we will not use it. To turn off the Yates correction, include correct=F as an argument to chisq.test().

    > x <- c(30,60,60,150)       # enter data
    > x <- matrix(x,2,2)         # convert to a matrix
    > chisq.test(x,correct=F)

        Pearson's chi-square test without Yates' continuity correction

    data: x
    X-squared = ..., df = 1, p-value = ...

To calculate $G^2$ in R or S-PLUS, you need to go through essentially the same steps as in Minitab.

    > ob <- c(30,60,60,150)          # observed counts
    > rsum <- c(90,90,210,210)       # row sum for each cell
    > csum <- c(90,210,90,210)       # column sum for each cell
    > ex <- rsum*csum/300            # expected counts
    > G2 <- 2*sum(ob*log(ob/ex))     # deviance
    > G2
    [1] ...

In SAS, use the CHISQ option on the TABLES statement of PROC FREQ. For two-way and higher-way tables it gives you both the Pearson $X^2$ statistic and the deviance $G^2$ (reported as the likelihood ratio chi-square). See: [link missing]

Multinomial sampling: In one type of experiment, we draw a sample of $n = x_{++}$ subjects from a population and record $(Y, Z)$ for each subject. Then the joint distribution of $\{x_{ij}\}$ is multinomial with index $n$ and parameter $\pi = \{\pi_{ij}\}$, where $\pi_{ij} = P(Y = i, Z = j)$ and the grand total $n$ is fixed and known.

Sometimes we express the parameter in terms of the cell means $m_{ij} = E(x_{ij}) = n\pi_{ij}$.

Poisson sampling: $x_{ij} \sim \text{Poisson}(m_{ij})$ independently for $i = 1, \ldots, I$ and $j = 1, \ldots, J$. In this scheme, the overall $n$ is not fixed.

Example: You sit by the roadside for one hour with a radar gun, checking the speed of each car as it passes by. You record $Y$ = color of the car (1=black, 2=white, 3=red, 4=other) and $Z$ = whether the car's speed exceeds the legal limit (1=yes, 2=no).

In Lecture 4, we argued that the likelihood function may be factored into the product of a Poisson likelihood for $n$,

$n \sim \text{Poisson}(m_{++})$,

and a multinomial likelihood for $\{x_{ij}\}$ given $n$, with parameters

$\pi_{ij} = m_{ij} / m_{++}$.

The total $n$ provides no information about $\pi = \{\pi_{ij}\}$. From a likelihood standpoint, we get the same inferences about $\pi$ whether $n$ is regarded as fixed or random.

Therefore, if $m_{++}$ is not of interest, Poisson data may be analyzed as if they were multinomial. Conversely, if data are multinomial, we may analyze them as if they were Poisson. The inferences for $\pi$ are valid, and the inferences for $m_{++}$ should be ignored.
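The factorization can be checked numerically for one table. In the Python sketch below, the cell counts and Poisson means are hypothetical values chosen only for illustration, and the helper names are assumptions; it compares the product of independent Poisson probabilities with the product of a Poisson probability for $n$ and a multinomial probability for the cells given $n$.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def multinomial_pmf(counts, probs):
    """P(counts) for a multinomial with index sum(counts) and the given probs."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)   # stays an exact integer at every step
    p = float(coef)
    for c, pr in zip(counts, probs):
        p *= pr ** c
    return p

x = [3, 1, 2, 4]             # hypothetical cell counts x_11, x_12, x_21, x_22
m = [2.5, 1.0, 2.0, 3.5]     # hypothetical Poisson means m_ij

n = sum(x)
m_total = sum(m)             # m_{++}
pi = [mij / m_total for mij in m]

joint_poisson = math.prod(poisson_pmf(k, mean) for k, mean in zip(x, m))
factored = poisson_pmf(n, m_total) * multinomial_pmf(x, pi)

print(math.isclose(joint_poisson, factored, rel_tol=1e-9))  # True
```

Because the Poisson factor involves only $m_{++}$ and the multinomial factor involves only $\pi$, maximizing over $\pi$ gives the same answer under either sampling scheme.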

Product-multinomial sampling: Decide beforehand that we will draw $x_{i+}$ subjects with characteristic $Y = i$ ($i = 1, \ldots, I$) and record the $Z$-value for each one. In this scenario, each row of the table $(x_{i1}, x_{i2}, \ldots, x_{iJ})^T$ is multinomial with probabilities $\pi_{j|i} = \pi_{ij}/\pi_{i+}$, and the rows are independent.

Viewing the data as product-multinomial is appropriate when the row totals truly are fixed by design, as in

stratified random sampling (strata defined by $Y$)
an experiment where $Y$ = treatment group

It's also appropriate when the row totals are not fixed, but we are interested in $P(Z \mid Y)$ and not $P(Y)$. That is, when $Z$ is the outcome of interest, and $Y$ is an explanatory variable that we do not wish to model.

Suppose the data are multinomial. Then by results from Lecture 4, we may factor the likelihood into two parts:

a multinomial likelihood for the row totals $(x_{1+}, x_{2+}, \ldots, x_{I+})^T$ with index $n$ and parameter $\{\pi_{i+}\}$
$I$ independent multinomial likelihoods for the rows, $(x_{i1}, x_{i2}, \ldots, x_{iJ})^T$, with parameters $\{\pi_{j|i} = \pi_{ij}/\pi_{i+}\}$.

Therefore, if the parameters of interest to us can be expressed as functions only of the $\pi_{j|i}$'s and not the $\pi_{i+}$'s, then correct likelihood-based inferences may be obtained by treating the data as if they were product-multinomial.

Conversely, if the data are product-multinomial, then correct likelihood-based inferences about functions of the $\pi_{j|i}$'s will be obtained if we analyze the data as if they were multinomial. We may also treat them as Poisson, ignoring any inferences about $m_{++}$ or $m_{i+}$.
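This second factorization can also be checked on a single table. The Python sketch below uses hypothetical counts and joint probabilities (not data from the lecture) and an illustrative helper name; it compares the full multinomial probability with the product of the row-total multinomial and the per-row conditional multinomials.

```python
import math

def multinomial_pmf(counts, probs):
    """P(counts) for a multinomial with index sum(counts) and the given probs."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)   # stays an exact integer at every step
    p = float(coef)
    for c, pr in zip(counts, probs):
        p *= pr ** c
    return p

x = [[3, 1],
     [2, 4]]                 # hypothetical 2x2 counts
pi = [[0.10, 0.20],
      [0.30, 0.40]]          # hypothetical joint probabilities pi_ij

# Full multinomial likelihood over the four cells.
full = multinomial_pmf([c for row in x for c in row],
                       [p for row in pi for p in row])

# Part 1: multinomial for the row totals with parameter {pi_{i+}}.
row_totals = [sum(row) for row in x]
row_probs = [sum(row) for row in pi]
part_rows = multinomial_pmf(row_totals, row_probs)

# Part 2: independent multinomials within each row with parameters pi_{j|i}.
part_cond = math.prod(
    multinomial_pmf(x[i], [pij / row_probs[i] for pij in pi[i]])
    for i in range(len(x))
)

print(math.isclose(full, part_rows * part_cond, rel_tol=1e-9))  # True
```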

Hypergeometric sampling: In a few rare examples, we may encounter data where both the row totals $(x_{1+}, \ldots, x_{I+})^T$ and the column totals $(x_{+1}, \ldots, x_{+J})^T$ are fixed by design. The best-known example of this is Fisher's hypothetical example of the lady tasting tea, which we will discuss soon. In a $2 \times 2$ table, the resulting sampling distribution is hypergeometric.

Even when both sets of marginal totals are not fixed by design, some statisticians like to condition on them and perform exact inference when the sample size is small and asymptotic approximations are unlikely to work well. Methods for exact inference will be discussed later.
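As a preview, the hypergeometric distribution of $x_{11}$ given both margins can be sketched in a few lines of Python. The margins below (two rows of 4 and a first column of 4) are an assumption based on the classic eight-cup version of the tea-tasting design, and the function name is hypothetical; neither is taken from this lecture.

```python
import math

def hypergeom_pmf(k, row1, row2, col1):
    """P(x_11 = k) in a 2x2 table with row totals row1, row2 and first column total col1."""
    n = row1 + row2
    return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

row1, row2, col1 = 4, 4, 4      # assumed margins: 8 cups, 4 of each kind, 4 guesses of each kind

# x_11 can range from max(0, col1 - row2) to min(row1, col1).
support = range(max(0, col1 - row2), min(row1, col1) + 1)
probs = {k: hypergeom_pmf(k, row1, row2, col1) for k in support}

for k, p in probs.items():
    print(k, round(p, 4))
```

With these margins the probability that all four are identified correctly is $1/\binom{8}{4} = 1/70$, and the probabilities over the support sum to 1.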

Next lecture

Suggested reading: Ch. 2 and Ch. 3 of Agresti.

Next week we'll cover the test of independence, measures of association, and exact tests for $2 \times 2$ and $I \times J$ tables.

There is no regular homework assignment due next week. However, there is an EXTRA credit assignment due on Tuesday, Feb. 8:

1. For the French skier example, are the two variables independent; i.e., are the treatment and response independent?
2. What seems to be the most reasonable sampling scheme for this problem? E.g., if you were to design the study, which sampling model discussed today would you apply, and why?
3. Read the on-line information (example) on analysis of $2 \times 2$ tables in SAS (see slide 21, the SAS PROC FREQ slide). Run the analysis of the overweight example in SAS. Submit your code and compare your results to what we got in class today. What's the most appropriate sampling model for this example, and why?


More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

:the actual population proportion are equal to the hypothesized sample proportions 2. H a

:the actual population proportion are equal to the hypothesized sample proportions 2. H a AP Statistics Chapter 14 Chi- Square Distribution Procedures I. Chi- Square Distribution ( χ 2 ) The chi- square test is used when comparing categorical data or multiple proportions. a. Family of only

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Introduction to Statistical Analysis. Cancer Research UK 12 th of February 2018 D.-L. Couturier / M. Eldridge / M. Fernandes [Bioinformatics core]

Introduction to Statistical Analysis. Cancer Research UK 12 th of February 2018 D.-L. Couturier / M. Eldridge / M. Fernandes [Bioinformatics core] Introduction to Statistical Analysis Cancer Research UK 12 th of February 2018 D.-L. Couturier / M. Eldridge / M. Fernandes [Bioinformatics core] 2 Timeline 9:30 Morning I I 45mn Lecture: data type, summary

More information

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example

More information

n y π y (1 π) n y +ylogπ +(n y)log(1 π).

n y π y (1 π) n y +ylogπ +(n y)log(1 π). Tests for a binomial probability π Let Y bin(n,π). The likelihood is L(π) = n y π y (1 π) n y and the log-likelihood is L(π) = log n y +ylogπ +(n y)log(1 π). So L (π) = y π n y 1 π. 1 Solving for π gives

More information

Statistics - Lecture 04

Statistics - Lecture 04 Statistics - Lecture 04 Nicodème Paul Faculté de médecine, Université de Strasbourg file:///users/home/npaul/enseignement/esbs/2018-2019/cours/04/index.html#40 1/40 Correlation In many situations the objective

More information

Chapter 5: Logistic Regression-I

Chapter 5: Logistic Regression-I : Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 TIME: 3 hours. Total marks: 80. (Marks are indicated in margin.) Remember that estimate means to give an interval estimate.

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information

Goodness of Fit Goodness of fit - 2 classes

Goodness of Fit Goodness of fit - 2 classes Goodness of Fit Goodness of fit - 2 classes A B 78 22 Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p A = 0.75! Exact p-value Exact confidence

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

Ordinal Variables in 2 way Tables

Ordinal Variables in 2 way Tables Ordinal Variables in 2 way Tables Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2018 C.J. Anderson (Illinois) Ordinal Variables

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

Chapter 8. The analysis of count data. This is page 236 Printer: Opaque this

Chapter 8. The analysis of count data. This is page 236 Printer: Opaque this Chapter 8 The analysis of count data This is page 236 Printer: Opaque this For the most part, this book concerns itself with measurement data and the corresponding analyses based on normal distributions.

More information

Psych Jan. 5, 2005

Psych Jan. 5, 2005 Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture

More information

10: Crosstabs & Independent Proportions

10: Crosstabs & Independent Proportions 10: Crosstabs & Independent Proportions p. 10.1 P Background < Two independent groups < Binary outcome < Compare binomial proportions P Illustrative example ( oswege.sav ) < Food poisoning following church

More information

ML Testing (Likelihood Ratio Testing) for non-gaussian models

ML Testing (Likelihood Ratio Testing) for non-gaussian models ML Testing (Likelihood Ratio Testing) for non-gaussian models Surya Tokdar ML test in a slightly different form Model X f (x θ), θ Θ. Hypothesist H 0 : θ Θ 0 Good set: B c (x) = {θ : l x (θ) max θ Θ l

More information

Hypothesis Testing One Sample Tests

Hypothesis Testing One Sample Tests STATISTICS Lecture no. 13 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 12. 1. 2010 Tests on Mean of a Normal distribution Tests on Variance of a Normal

More information

Lecture 10: Generalized likelihood ratio test

Lecture 10: Generalized likelihood ratio test Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual

More information

BMI 541/699 Lecture 22

BMI 541/699 Lecture 22 BMI 541/699 Lecture 22 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Power and sample size for t-based

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV Theory of Engineering Experimentation Chapter IV. Decision Making for a Single Sample Chapter IV 1 4 1 Statistical Inference The field of statistical inference consists of those methods used to make decisions

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Contingency Tables Part One 1

Contingency Tables Part One 1 Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview

More information

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model

More information

Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Lecture 8: Summary Measures

Lecture 8: Summary Measures Lecture 8: Summary Measures Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 8:

More information

Chapter 11: Analysis of matched pairs

Chapter 11: Analysis of matched pairs Chapter 11: Analysis of matched pairs Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 42 Chapter 11: Models for Matched Pairs Example: Prime

More information

INTRODUCTION TO ANALYSIS OF VARIANCE

INTRODUCTION TO ANALYSIS OF VARIANCE CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game.

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game. EdPsych/Psych/Soc 589 C.J. Anderson Homework 5: Answer Key 1. Probelm 3.18 (page 96 of Agresti). (a) Y assume Poisson random variable. Plausible Model: E(y) = µt. The expected number of arrests arrests

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Is the cholesterol concentration in blood related to the body mass index (bmi)?

Is the cholesterol concentration in blood related to the body mass index (bmi)? Regression problems The fundamental (statistical) problems of regression are to decide if an explanatory variable affects the response variable and estimate the magnitude of the effect Major question:

More information

Stat 5421 Lecture Notes Simple Chi-Square Tests for Contingency Tables Charles J. Geyer March 12, 2016

Stat 5421 Lecture Notes Simple Chi-Square Tests for Contingency Tables Charles J. Geyer March 12, 2016 Stat 5421 Lecture Notes Simple Chi-Square Tests for Contingency Tables Charles J. Geyer March 12, 2016 1 One-Way Contingency Table The data set read in by the R function read.table below simulates 6000

More information