HOW TO USE PROC CATMOD IN ESTIMATION PROBLEMS


Olaf Gefeller (1), Franz Woltering (2)
(1) Abteilung Medizinische Statistik, Georg-August-Universität Göttingen
(2) Fachbereich Statistik, Universität Dortmund

Abstract

The paper describes a new application of the statistical analysis procedure PROC CATMOD. It demonstrates how PROC CATMOD can be easily adapted to estimation problems in contingency tables. In particular, it is shown how estimators of measures of association and their asymptotic variances can be calculated using PROC CATMOD. The practical application of this approach is illustrated for the kappa-coefficient. Real data from the German Forest Decline Survey are used in the practical example.

Keywords: GSK-model, kappa-coefficient, measures of association, variance calculation

1. Introduction

PROC CATMOD, one of the statistical analysis procedures within the SAS/STAT* software, was originally designed in the context of the Grizzle, Starmer & Koch (1969) (subsequently abbreviated GSK) approach to fit linear models to functions of response frequencies in contingency tables, as used in linear modeling, log-linear modeling, logistic regression, and repeated measurement analysis. The GSK-approach is typically applied to the analysis of contingency tables with ordered responses (Williams & Grizzle, 1972), the measurement of observer agreement (Landis & Koch, 1977), the analysis of repeated measurement experiments (Koch et al., 1977), and rank function analysis (Semenya et al., 1983). In all these applications PROC CATMOD has been successfully employed to solve the computational problems of the analysis. It represents a complex but powerful tool, which needs some experience to handle (and patience to struggle through more than 100 pages of documentation in the SAS/STAT* manual). In this paper we demonstrate that it can be easily adapted to estimation problems in contingency tables.

For a broad class of estimators of measures of association under the multinomial sampling model we show how to use PROC CATMOD to calculate the estimator of the measure of association and, more importantly, the asymptotic variance of the estimator. The statistical background of our method, the link between the GSK-approach and the estimation of measures of association, is not presented here (see Gefeller & Woltering, 1991). To illustrate the practical application of the procedure, the well-known kappa-coefficient, a measure of agreement between two different ratings, is used, and data from the German Forest Decline Survey (Krahl-Urban et al., 1988) are presented. Further general remarks on the proposed method are provided in the concluding section.

2. Methods

To adapt the GSK-approach to estimation problems in contingency tables, complex ratio statistics such as measures of association have to be written as special functions of the probability estimates of the underlying product multinomial model. To generate estimators of this type, compounded functions involving only linear, logarithmic, and exponential transformations (Forthofer & Koch, 1973) are employed. They have the general form

$$F(p) = \ldots A_5 [\exp(A_4 [\ln(A_3 [\exp(A_2 [\ln(A_1 p)])])])],$$

where $A_i$ denotes a matrix of constants. This general framework offers the opportunity to compute complex estimators in which probabilities from different subpopulations are combined. In the special situation of a single multinomial population, obtained by unrestricted sampling of elements, measures of association can be expressed in the same way as illustrated above. The advantage in this situation is that standard software for GSK-models such as PROC CATMOD can be used to calculate the estimators and their asymptotic variances.

The device consists of fitting the simple linear model $F(p) = X\beta$, where $F(p)$ is the 1-dimensional response function as specified above, $X$ denotes the degenerate 1 x 1 matrix consisting of the constant '1', and $\beta$ represents the 1-dimensional parameter. Then the estimator $b$ of $\beta$ equals the response function $F$ itself, and the variance of $b$ is given by $V_b = V_F$. This analysis is possible in PROC CATMOD by directly specifying the design matrix on the MODEL statement and by using the RESPONSE statement to describe a series of transformations of the probability estimates that produces $F(p)$, the function of interest.

At first glance this procedure seems to increase the computational effort by introducing the tedious work of constructing a series of transformations to describe the measure of association. In fact, the most annoying part of the computational task, the calculation of the asymptotic variance through computation of the first derivative matrix and of additional matrix products, is completely undertaken by the computer program. Thus the computational effort for the user is reduced substantially.

3. Practical Example

To illustrate the practical application of the method, the well-known kappa-coefficient (Cohen, 1960) is used. Cohen's kappa is a popular measure of agreement between two different ratings. It is defined for square K x K contingency tables. Using the usual row-column parametrization of cell probabilities in contingency tables, denoted by $\pi_{ij}$, $i, j = 1, \ldots, K$, the kappa-coefficient is defined as

$$\kappa := \frac{\sum_{i=1}^{K} \pi_{ii} - \sum_{i=1}^{K} \pi_{i.}\pi_{.i}}{1 - \sum_{i=1}^{K} \pi_{i.}\pi_{.i}}.$$

The term $\sum_{i=1}^{K} \pi_{i.}\pi_{.i}$ represents the expected value of agreement under the hypothesis of independent ratings.

Procedures for the estimation of the kappa-coefficient and its asymptotic variance are not available in standard statistical software packages. To use the SAS/STAT* procedure CATMOD to do the calculations in the way outlined above, the following steps have to be applied:

(1) specify $\kappa$ as a function of a suitable vector of probabilities $\pi_{ij}$
(2) transform $\kappa$ to a compounded function involving only linear, logarithmic and exponential operations
(3) set up the 'dummy' MODEL statement consisting of the constant '1' as the design matrix
(4) set up the RESPONSE statement using the transformation constructed in (2)
(5) run PROC CATMOD and look for the 'Analysis of weighted-least-squares estimates' table in the output, where the estimated value of $\kappa$ and its asymptotic standard error appear (in addition, the estimated asymptotic variance of $\kappa$ can be obtained directly by specifying the 'COVB' option on the MODEL statement)

As a numerical example we use data from the Forest Decline Survey (Krahl-Urban et al., 1988). The data are based on the variable 'loss of needles' (in percent), which has been categorized independently by two observers into four groups according to the severity of damage. They are presented in the following 4 x 4 contingency table:

[4 x 4 contingency table of observer 1 by observer 2: the cell frequencies are not recoverable from this transcription]

Step 1 involves only the definition of the vector $\pi \in R^{16}$, which is built up from the row-column cell probabilities $\pi_{ij}$, $i, j = 1, \ldots, 4$, as follows:

$$\pi := (\pi_{11}, \ldots, \pi_{14}, \pi_{21}, \ldots, \pi_{24}, \pi_{31}, \ldots, \pi_{34}, \pi_{41}, \ldots, \pi_{44})'$$

Step 2 needs a little more work. The representation of $\kappa$ as a compounded function described in (2) is of the following form:

$$\kappa = \exp(A_4 [\ln(A_3 [\exp(A_2 [\ln(A_1 \pi)])])]),$$

where $A_1$, $A_2$ and $A_3$ are matrices of constants whose entries are not recoverable from this transcription, and $A_4 = (1 \;\; -1)$.

The other steps (3) - (5) can be seen in the listing of the SAS program and output in the following sections.

4. SAS-Program of the Practical Example

    /* Kappa statistic for interobserver agreement              */
    /* 4 response categories of the variable 'loss of needles': */
    /*    1 = less than 5%                                      */
    /*    2 = 5% - 15%                                          */
    /*    3 = 15% - 25%                                         */
    /*    4 = more than 25%                                     */
    /* 2 independent observers                                  */
    /* Data source: Krahl-Urban et al. (1988)                   */

    title 'Measurement of interobserver agreement';

    /* input of the 4*4 contingency table */
    data fds;
       input ob1 ob2 freq;
       cards;
    /* the 16 data lines (ob1 ob2 freq) are missing in this transcription */
    ;

    /* calculation of the measure of association,    */
    /* here: kappa-coefficient (see: Cohen, 1960)    */
    proc catmod data=fds;

       /* response statement to specify the series of transformations  */
       /* describing the kappa-coefficient; the numerical entries of   */
       /* the matrices are missing in this transcription               */
       response exp 1 -1
                log ...     /* rows of A3, separated by commas */
                exp ...     /* rows of A2, separated by commas */
                log ... ;   /* rows of A1, separated by commas */

       weight freq;

       /* degenerate 'dummy' model statement to use PROC CATMOD as a  */
       /* procedure for estimating measures of association and their  */
       /* asymptotic variances rather than for usual modeling         */
       model ob1 * ob2 = (1) / nodesign noprofile covb;
    run;
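As an independent cross-check of steps (1) - (5), the compounded function and its asymptotic variance can also be evaluated outside SAS. The following is a minimal sketch in plain Python, not the authors' program. The matrices built in `kappa_matrices` are one possible choice of $A_1, \ldots, A_4$ that reproduces the definition of $\kappa$ (the paper's own entries are lost in this transcription), the function names are hypothetical, and the 2 x 2 table at the end is made-up illustration data, not the Forest Decline Survey data.

```python
import math

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def kappa_matrices(k):
    """One possible (assumed) choice of A1..A4 for a k x k table whose
    cell probabilities are stacked row by row into a vector of length k*k."""
    n = k * k
    diag = [1.0 if i // k == i % k else 0.0 for i in range(n)]
    rows = [[1.0 if i // k == r else 0.0 for i in range(n)] for r in range(k)]
    cols = [[1.0 if i % k == c else 0.0 for i in range(n)] for c in range(k)]
    # A1 p = (observed agreement, row margins, column margins, total = 1)
    A1 = [diag] + rows + cols + [[1.0] * n]
    m = 2 * k + 2
    # A2 ln(.) = (ln P_o, ln(row_i * col_i) for each i, ln 1)
    A2 = [[1.0 if j == 0 else 0.0 for j in range(m)]]
    for i in range(k):
        A2.append([1.0 if j in (1 + i, 1 + k + i) else 0.0 for j in range(m)])
    A2.append([1.0 if j == m - 1 else 0.0 for j in range(m)])
    # A3 exp(.) = (P_o - P_e, 1 - P_e)
    A3 = [[1.0] + [-1.0] * k + [0.0],
          [0.0] + [-1.0] * k + [1.0]]
    A4 = [[1.0, -1.0]]  # ln numerator minus ln denominator
    return A1, A2, A3, A4

def compound(p, mats):
    """Evaluate F(p) = exp(A4 ln(A3 exp(A2 ln(A1 p)))) and its gradient.
    Raises ValueError when P_o <= P_e, because the log is then undefined."""
    A1, A2, A3, A4 = mats
    a1 = matvec(A1, p)
    H = [[h / v for h in row] for row, v in zip(A1, a1)]
    a2 = [math.exp(v) for v in matvec(A2, [math.log(v) for v in a1])]
    H = [[e * h for h in row] for row, e in zip(matmul(A2, H), a2)]
    a3 = matvec(A3, a2)
    H = [[h / v for h in row] for row, v in zip(matmul(A3, H), a3)]
    f = [math.exp(v) for v in matvec(A4, [math.log(v) for v in a3])]
    H = [[e * h for h in row] for row, e in zip(matmul(A4, H), f)]
    return f[0], H[0]

def kappa_with_se(counts):
    """Kappa and its asymptotic standard error for a square table of counts,
    assuming a single multinomial sample."""
    k = len(counts)
    n = sum(map(sum, counts))
    p = [c / n for row in counts for c in row]
    est, h = compound(p, kappa_matrices(k))
    # h V_p h' with V_p = (diag(p) - p p') / n
    hp = sum(hi * pi for hi, pi in zip(h, p))
    var = (sum(hi * hi * pi for hi, pi in zip(h, p)) - hp * hp) / n
    return est, math.sqrt(var)

# Hypothetical 2 x 2 table, for illustration only
est, se = kappa_with_se([[40, 10], [10, 40]])
print(round(est, 6), round(se, 6))  # prints: 0.6 0.08
```

For the illustrative table, observed agreement is 0.8 and chance agreement is 0.5, so $\kappa = 0.3 / 0.5 = 0.6$. Keeping the total probability as an explicit component of $A_1 \pi$ lets the denominator $1 - \sum_i \pi_{i.}\pi_{.i}$ be formed by a purely linear step. It does not affect the variance, because adding a constant to the gradient leaves $h V_p h'$ unchanged under the multinomial covariance.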

5. SAS-Output of the Practical Example

[Only the layout of the listing is recoverable from this transcription; the numerical values are missing.]

    Measurement of interobserver agreement

    CATMOD PROCEDURE

    Response: OB1*OB2          Response Levels (R)=
    Weight variable: FREQ      Populations     (S)=
    Data Set: FDS              Total Frequency (N)=
                               Observations  (Obs)=

    ANALYSIS OF VARIANCE TABLE

    Source        DF    Chi-Square    Prob
    MODEL|MEAN
    RESIDUAL

    NOTE: Effects marked with * contained 1 or more singularities
          (i.e., redundant parameters).

    ANALYSIS OF WEIGHTED-LEAST-SQUARES ESTIMATES

    Effect    Parameter    Estimate    Standard Error    Chi-Square    Prob
    MODEL

    COVARIANCE MATRIX OF THE PARAMETER ESTIMATES
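For orientation, the standard error in the table of weighted-least-squares estimates follows from the usual linearization (delta-method) argument for compounded functions. The sketch below uses the notation of Section 3. The intermediate vectors $a_1, a_2, a_3$ and the symbol $D_x$ (the diagonal matrix with the elements of $x$ on its diagonal) are notation introduced here for illustration, and a single multinomial sample of size $N$ with observed proportions $p$ is assumed.

```latex
\begin{aligned}
a_1 &= A_1 p, \qquad a_2 = \exp(A_2 \ln a_1), \qquad a_3 = A_3 a_2, \qquad F(p) = \exp(A_4 \ln a_3), \\
H &= \frac{\partial F}{\partial p}
   = D_{F}\, A_4\, D_{a_3}^{-1}\, A_3\, D_{a_2}\, A_2\, D_{a_1}^{-1}\, A_1, \\
V_p &= \frac{1}{N}\left( D_p - p\,p' \right), \qquad
V_F = H\, V_p\, H'.
\end{aligned}
```

Because the design matrix is $X = (1)$, the weighted-least-squares estimate is $b = F(p)$ with $V_b = V_F$, so the reported standard error is $\sqrt{V_F}$ and the COVB option prints $V_F$ itself.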

6. Discussion

In different fields of statistical application, such as the social sciences, psychology, biology, and epidemiology, a huge number of specific measures of association have been proposed. Producers of statistical software packages like SAS fight a losing battle in trying to extend their systems to cover all measures of association proposed in specific applications, because the variety of ways to describe the relationship between variables with regard to some specific feature of the association seems to be unlimited. Each year some new measures of association are added to this multitude, and no end to this development is in sight. Whereas, in general, the calculation of the estimator of the measure of association constitutes no problem, the asymptotic variance of the estimator is not easy to obtain computationally.

In this paper we have shown how to use PROC CATMOD of the SAS/STAT* software to solve the computational problems. The advantage of this new approach lies in a substantial reduction of the computational effort for the user. The cumbersome calculation of the asymptotic variance is completely undertaken by the program. The only restriction of our method results from the distributional assumption implicitly employed when using the GSK-methodology. Therefore, for example, data of contingency tables arising from the hypergeometric sampling model (i.e. all marginal distributions are fixed prior to sampling) cannot be analysed in this framework. But for all situations of the multinomial sampling model our approach provides a flexible and convenient way of estimating measures of association and their asymptotic variances.

References

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psych. Meas. 20.
Forthofer, R.N. and Koch, G.G. (1973). An analysis for compounded functions of categorical data. Biometrics 29.
Gefeller, O. and Woltering, F. (1991). A general method of estimating measures of association and their asymptotic variances under the multinomial model using standard SAS software. Computat. Statist. & Data Analysis (submitted).
Grizzle, J.E., Starmer, C.F. and Koch, G.G. (1969). Analysis of categorical data by linear models. Biometrics 25.
Koch, G.G., Landis, J.R., Freeman, J.L., Freeman, D.H. and Lehnen, R.G. (1977). A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics 33.
Krahl-Urban, B., Papke, H.E., Peters, K. and Schimansky, C. (1988). Forest decline. Cause-effect research in the United States of North America and Federal Republic of Germany, Jülich.
Landis, J.R. and Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics 33.
Semenya, K.A., Koch, G.G., Stokes, M.E. and Forthofer, R.N. (1983). Linear models methods for some rank functions analyses of ordinal categorical data. Commun. Statistics - Theory Meth. 12.
Williams, O.D. and Grizzle, J.E. (1972). Analysis of contingency tables having ordered response categories. JASA 67.

* SAS/STAT is a registered trademark of SAS Institute Inc., Cary, NC, USA.


More information

DISPLAYING THE POISSON REGRESSION ANALYSIS

DISPLAYING THE POISSON REGRESSION ANALYSIS Chapter 17 Poisson Regression Chapter Table of Contents DISPLAYING THE POISSON REGRESSION ANALYSIS...264 ModelInformation...269 SummaryofFit...269 AnalysisofDeviance...269 TypeIII(Wald)Tests...269 MODIFYING

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

The GENMOD Procedure. Overview. Getting Started. Syntax. Details. Examples. References. SAS/STAT User's Guide. Book Contents Previous Next

The GENMOD Procedure. Overview. Getting Started. Syntax. Details. Examples. References. SAS/STAT User's Guide. Book Contents Previous Next Book Contents Previous Next SAS/STAT User's Guide Overview Getting Started Syntax Details Examples References Book Contents Previous Next Top http://v8doc.sas.com/sashtml/stat/chap29/index.htm29/10/2004

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

Interactions among Continuous Predictors

Interactions among Continuous Predictors Interactions among Continuous Predictors Today s Class: Simple main effects within two-way interactions Conquering TEST/ESTIMATE/LINCOM statements Regions of significance Three-way interactions (and beyond

More information

Stat 587: Key points and formulae Week 15

Stat 587: Key points and formulae Week 15 Odds ratios to compare two proportions: Difference, p 1 p 2, has issues when applied to many populations Vit. C: P[cold Placebo] = 0.82, P[cold Vit. C] = 0.74, Estimated diff. is 8% What if a year or place

More information

Assessing agreement with multiple raters on correlated kappa statistics

Assessing agreement with multiple raters on correlated kappa statistics Biometrical Journal 52 (2010) 61, zzz zzz / DOI: 10.1002/bimj.200100000 Assessing agreement with multiple raters on correlated kappa statistics Hongyuan Cao,1, Pranab K. Sen 2, Anne F. Peery 3, and Evan

More information

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013 Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

The Spectrum of Broadway: A SAS

The Spectrum of Broadway: A SAS The Spectrum of Broadway: A SAS PROC SPECTRA Inquiry James D. Ryan and Joseph Earley Emporia State University and Loyola Marymount University Abstract This paper describes how to use the sophisticated

More information

Types of Statistical Tests DR. MIKE MARRAPODI

Types of Statistical Tests DR. MIKE MARRAPODI Types of Statistical Tests DR. MIKE MARRAPODI Tests t tests ANOVA Correlation Regression Multivariate Techniques Non-parametric t tests One sample t test Independent t test Paired sample t test One sample

More information

BIOS 625 Fall 2015 Homework Set 3 Solutions

BIOS 625 Fall 2015 Homework Set 3 Solutions BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's

More information

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services

More information

Simultaneous Confidence Intervals and Multiple Contrast Tests

Simultaneous Confidence Intervals and Multiple Contrast Tests Simultaneous Confidence Intervals and Multiple Contrast Tests Edgar Brunner Abteilung Medizinische Statistik Universität Göttingen 1 Contents Parametric Methods Motivating Example SCI Method Analysis of

More information

Special Topics. Handout #4. Diagnostics. Residual Analysis. Nonlinearity

Special Topics. Handout #4. Diagnostics. Residual Analysis. Nonlinearity Special Topics Diagnostics Residual Analysis As with linear regression, an analysis of residuals is necessary to assess the model adequacy. The same techniques may be employed such as plotting residuals

More information

Logistic Regression. Continued Psy 524 Ainsworth

Logistic Regression. Continued Psy 524 Ainsworth Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression

More information

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response) Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Categorical Data Analysis Chapter 3

Categorical Data Analysis Chapter 3 Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,

More information

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation. Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1

More information

ON INFERENCE FROM GENERAL CATEGORICAL DATA WITH MISCLASSIFICATION ERRORS BASED ON DOUBLE SAMPLING SCHEMES. Yosef Hochberg

ON INFERENCE FROM GENERAL CATEGORICAL DATA WITH MISCLASSIFICATION ERRORS BASED ON DOUBLE SAMPLING SCHEMES. Yosef Hochberg ~.e ON INFERENCE FROM GENERAL CATEGORICAL DATA WITH MISCLASSIFICATION ERRORS BASED ON DOUBLE SAMPLING SCHEMES by Yosef Hochberg Department of Bios~atistics University of North Carolina at Chapel Hill Institute

More information

CDA Chapter 3 part II

CDA Chapter 3 part II CDA Chapter 3 part II Two-way tables with ordered classfications Let u 1 u 2... u I denote scores for the row variable X, and let ν 1 ν 2... ν J denote column Y scores. Consider the hypothesis H 0 : X

More information

The Function Selection Procedure

The Function Selection Procedure ABSTRACT Paper 2390-2018 The Function Selection Procedure Bruce Lund, Magnify Analytic Solutions, a Division of Marketing Associates, LLC The function selection procedure (FSP) finds a very good transformation

More information

ABSTRACT KEYWORDS 1. INTRODUCTION

ABSTRACT KEYWORDS 1. INTRODUCTION THE SAMPLE SIZE NEEDED FOR THE CALCULATION OF A GLM TARIFF BY HANS SCHMITTER ABSTRACT A simple upper bound for the variance of the frequency estimates in a multivariate tariff using class criteria is deduced.

More information

LOOKING FOR RELATIONSHIPS

LOOKING FOR RELATIONSHIPS LOOKING FOR RELATIONSHIPS One of most common types of investigation we do is to look for relationships between variables. Variables may be nominal (categorical), for example looking at the effect of an

More information

RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar

RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar Paper S02-2007 RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar Eli Lilly & Company, Indianapolis, IN ABSTRACT

More information

Cohen s s Kappa and Log-linear Models

Cohen s s Kappa and Log-linear Models Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Negative Multinomial Model and Cancer. Incidence

Negative Multinomial Model and Cancer. Incidence Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,

More information

Measuring relationships among multiple responses

Measuring relationships among multiple responses Measuring relationships among multiple responses Linear association (correlation, relatedness, shared information) between pair-wise responses is an important property used in almost all multivariate analyses.

More information

USE OF THE SAS VARCOMP PROCEDURE TO ESTIMATE ANALYTICAL REPEATABILITY. Anna Caroli Istituto di Zootecnica Veterinaria - Milano - Italy

USE OF THE SAS VARCOMP PROCEDURE TO ESTIMATE ANALYTICAL REPEATABILITY. Anna Caroli Istituto di Zootecnica Veterinaria - Milano - Italy INTRODUCTION USE OF THE SAS VARCOMP PROCEDURE TO ESTIMATE ANALYTICAL REPEATABILITY Anna Caroli Istituto di Zootecnica Veterinaria - Milano - Italy Researchers often have to assess if an analytical method

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

COLLABORATION OF STATISTICAL METHODS IN SELECTING THE CORRECT MULTIPLE LINEAR REGRESSIONS

COLLABORATION OF STATISTICAL METHODS IN SELECTING THE CORRECT MULTIPLE LINEAR REGRESSIONS American Journal of Biostatistics 4 (2): 29-33, 2014 ISSN: 1948-9889 2014 A.H. Al-Marshadi, This open access article is distributed under a Creative Commons Attribution (CC-BY) 3.0 license doi:10.3844/ajbssp.2014.29.33

More information

Selection and Transformation of Continuous Predictors for Logistic Regression

Selection and Transformation of Continuous Predictors for Logistic Regression Paper AA-09-2014 Selection and Transformation of Continuous Predictors for Logistic Regression ABSTRACT Bruce Lund, Magnify Analytic Solutions A Division of Marketing Associates, Detroit, MI This paper

More information