HOW TO USE PROC CATMOD IN ESTIMATION PROBLEMS
Olaf Gefeller¹, Franz Woltering²
¹ Abteilung Medizinische Statistik, Georg-August-Universität Göttingen
² Fachbereich Statistik, Universität Dortmund

Abstract

The paper describes a new application of the statistical analysis procedure PROC CATMOD. It demonstrates how PROC CATMOD can be easily adapted to estimation problems in contingency tables. In particular, it is shown how estimators of measures of association and their asymptotic variances can be calculated using PROC CATMOD. The practical application of this approach is illustrated for the kappa-coefficient. Real data from the German Forest Decline Survey are used in the practical example.

Keywords: GSK-model, kappa-coefficient, measures of association, variance calculation

1. Introduction

PROC CATMOD, one of the statistical analysis procedures within the SAS/STAT* software, was originally designed in the context of the Grizzle, Starmer & Koch (1969) approach (subsequently abbreviated GSK) to fit linear models to functions of response frequencies in contingency tables, as used in linear modeling, log-linear modeling, logistic regression, and repeated measurement analysis. The GSK-approach is typically applied to the analysis of contingency tables with ordered responses (Williams & Grizzle, 1972), the measurement of observer agreement (Landis & Koch, 1977), the analysis of repeated measurement experiments (Koch et al., 1977), and rank function analysis (Semenya et al., 1983). In all these applications PROC CATMOD has been successfully employed to solve the computational problems of the analysis. It is a complex but powerful tool, which needs some experience to handle (and patience to struggle through more than 100 pages of documentation in the SAS/STAT* manual). In this paper we demonstrate that it can be easily adapted to estimation problems in contingency tables.
For a broad class of estimators of measures of association under the multinomial sampling model we show how to use PROC CATMOD to calculate the estimator of the measure of association and, more importantly, the asymptotic variance of the estimator. The statistical background of our method, the link between the GSK-approach and the estimation of measures of association, will not be presented here (see Gefeller & Woltering, 1991). To illustrate the practical application of the procedure, the well-known kappa-coefficient, a measure of agreement between two different ratings, is used, and data from the German Forest Decline Survey (Krahl-Urban et al., 1988) are presented. Further general remarks on the proposed method are provided in the concluding section.
2. Methods

To adapt the GSK-approach to estimation problems in contingency tables, complex ratio statistics such as measures of association have to be written as special functions of the probability estimates of the underlying product multinomial model. To generate estimators of this type, compounded functions involving only linear, logarithmic, and exponential transformations (Forthofer & Koch, 1973) of the general form

$$F(p) = \dots A_5 \left[ \exp \left( A_4 \left[ \ln \left( A_3 \left[ \exp \left( A_2 \left[ \ln \left( A_1 p \right) \right] \right) \right] \right) \right] \right) \right],$$

where $A_i$ denotes a matrix of constants, are employed. This general framework offers the opportunity to compute complex estimators in which probabilities from different subpopulations are combined. In the special situation of a single multinomial population, obtained by unrestricted sampling of elements, measures of association can be expressed in the same way as illustrated above. The advantage in this situation is that standard software for GSK-models, such as PROC CATMOD, can be used to calculate the estimators and their asymptotic variances.

The device consists of fitting the simple linear model $F(p) = X\beta$, where $F(p)$ is the 1-dimensional response function as specified above, $X$ denotes the degenerate $1 \times 1$ matrix consisting of the constant '1', and $\beta$ represents the 1-dimensional parameter. Then the estimator $b$ of $\beta$ equals the response function $F$ itself and the variance of $b$ is given by $V_b = V_F$. This analysis is possible in PROC CATMOD by directly specifying the design matrix on the MODEL statement and by using the RESPONSE statement to describe a series of transformations of the probability estimates that produces $F(p)$, the function of interest. At first glance this procedure seems to increase the computational effort by introducing the tedious work of constructing a series of transformations to describe the measure of association.
But, in fact, the most annoying part of the computational task, the calculation of the asymptotic variance through computation of the first derivative matrix and of additional matrix products, is completely undertaken by the computer program. Thus the computational effort for the user is reduced substantially.

3. Practical Example

To illustrate the practical application of the method, the well-known kappa-coefficient (Cohen, 1960) is used. Cohen's kappa is a popular measure of agreement between two different ratings. It is defined for square $K \times K$ contingency tables. Using the usual row-column parametrization of cell probabilities in contingency tables, denoted by $\pi_{ij}$, $i, j = 1, \dots, K$, the kappa-coefficient is defined as

$$\kappa := \frac{\sum_{i=1}^{K} \pi_{ii} - \sum_{i=1}^{K} \pi_{i\cdot}\,\pi_{\cdot i}}{1 - \sum_{i=1}^{K} \pi_{i\cdot}\,\pi_{\cdot i}}.$$

The term $\sum_{i=1}^{K} \pi_{i\cdot}\,\pi_{\cdot i}$ represents the expected value of agreement under the hypothesis of independent ratings.

Procedures for the estimation of the kappa-coefficient and its asymptotic variance are not available in standard statistical software packages. To use the SAS/STAT* procedure CATMOD to do the calculations in the way outlined above, the following steps have to be applied:
(1) specify $\kappa$ as a function of a suitable vector of probabilities $\pi_{ij}$
(2) transform $\kappa$ to a compounded function involving only linear, logarithmic and exponential operations
(3) set up the 'dummy' MODEL statement consisting of the constant '1' as the design matrix
(4) set up the RESPONSE statement using the transformation constructed in (2)
(5) run PROC CATMOD and look for the 'Analysis of weighted-least-squares estimates' table in the output, where the estimated value of $\kappa$ and its asymptotic standard error appear (in addition, the estimated asymptotic variance of $\kappa$ can be obtained directly by specifying the 'COVB' option on the MODEL statement)

As a numerical example we use data from the Forest Decline Survey (Krahl-Urban et al., 1988). The data are based on the variable 'loss of needles' (in percent), which has been categorized independently by two observers into four groups according to the severity of damage, and are presented in a 4 x 4 contingency table.

[4 x 4 contingency table of the two observers' ratings with row and column totals; the cell counts are not preserved in this transcription]

Now, step 1 involves only the definition that the vector $\pi \in \mathbb{R}^{16}$ is built up from the row-column cell probabilities $\pi_{ij}$, $i, j = 1, \dots, 4$, as follows:

$$\pi := (\pi_{11}, \dots, \pi_{14}, \pi_{21}, \dots, \pi_{24}, \pi_{31}, \dots, \pi_{34}, \pi_{41}, \dots, \pi_{44})'$$

Step 2 needs a little more work. The representation of $\kappa$ as a compounded function described in (2) is of the following form:

$$\kappa = \exp \left( A_4 \left[ \ln \left( A_3 \left[ \exp \left( A_2 \left[ \ln \left( A_1 \pi \right) \right] \right) \right] \right) \right] \right)$$

where $A_4 = (1 \;\; -1)$. The entries of the constant matrices $A_1$, $A_2$ and $A_3$ are not preserved in this transcription.

The other steps (3) - (5) can be seen in the listing of the SAS-program and output in the following sections.

4. SAS-Program of the Practical Example

In the listing below, '...' marks the data lines and the matrix rows of the RESPONSE statement whose numeric entries are not preserved in this transcription.

    * Kappa statistic for interobserver agreement
      4 response categories of the variable 'loss of needles':
        1 = less than 5%
        2 = 5% - 15%
        3 = 15% - 25%
        4 = more than 25%
      2 independent observers
      Data source: Krahl-Urban et al. (1988) ;

    title 'Measurement of interobserver agreement';

    data fds;
      * input of the 4*4 contingency table ;
      input ob1 ob2 freq;
      cards;
    ...
    ;

    * calculation of the measure of association
      here: kappa-coefficient (see: Cohen, 1960) ;

    proc catmod data=fds;

      * response statement to specify series of transformations
        describing the kappa-coefficient ;
      response exp 1 -1
               log ...
               exp ...
               log ... ;

      weight freq;

      * degenerated 'dummy' model statement to use PROC CATMOD as a
        procedure for estimating measures of association and their
        asymptotic variances rather than for usual modeling ;
      model ob1 * ob2 = (1) / nodesign noprofile covb;

    run;
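Step 2 can be made concrete outside SAS. The following Python sketch (not the authors' program) builds one possible set of constant matrices A1 to A4 for a K x K table and evaluates the compounded function exp(A4 ln(A3 exp(A2 ln(A1 π)))) step by step. The matrix layout, the helper names `matvec`, `kappa_matrices` and `kappa_compound`, and the example counts are assumptions made for illustration only. Because logarithms are taken, this representation presupposes positive margins and observed agreement above chance agreement.

```python
import math


def matvec(A, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def kappa_matrices(K):
    """One possible choice of A1..A4 for a K x K table (illustrative assumption)."""
    cells = [(i, j) for i in range(K) for j in range(K)]  # row-major, like the vector pi
    # A1: observed agreement, K row margins, K column margins, total (= 1)
    A1 = [[1.0 if i == j else 0.0 for i, j in cells]]
    A1 += [[1.0 if i == r else 0.0 for i, j in cells] for r in range(K)]
    A1 += [[1.0 if j == c else 0.0 for i, j in cells] for c in range(K)]
    A1.append([1.0] * len(cells))
    # A2 acts on the log scale: adding two log margins multiplies the margins
    m = 2 * K + 2
    A2 = [[1.0 if col == 0 else 0.0 for col in range(m)]]
    A2 += [[1.0 if col in (1 + k, 1 + K + k) else 0.0 for col in range(m)]
           for k in range(K)]
    A2.append([1.0 if col == m - 1 else 0.0 for col in range(m)])
    # A3: numerator (Po - Pe) and denominator (1 - Pe)
    A3 = [[1.0] + [-1.0] * K + [0.0],
          [0.0] + [-1.0] * K + [1.0]]
    # A4: difference of the two logs, i.e. the log of the ratio
    A4 = [[1.0, -1.0]]
    return A1, A2, A3, A4


def kappa_compound(p, K):
    """Evaluate exp(A4 ln(A3 exp(A2 ln(A1 p)))) for cell proportions p."""
    A1, A2, A3, A4 = kappa_matrices(K)
    v = matvec(A1, p)
    v = [math.exp(x) for x in matvec(A2, [math.log(x) for x in v])]
    v = matvec(A3, v)
    v = [math.exp(x) for x in matvec(A4, [math.log(x) for x in v])]
    return v[0]


# Hypothetical 4 x 4 counts (NOT the Forest Decline Survey data)
counts = [[20, 5, 1, 0],
          [4, 15, 6, 1],
          [1, 5, 12, 4],
          [0, 2, 3, 21]]
n = sum(map(sum, counts))
p = [c / n for row in counts for c in row]
print(kappa_compound(p, 4))
```

For K = 4 this choice gives matrices with 10, 6, 2 and 1 rows. Other factorizations of kappa into linear, logarithmic and exponential steps are equally valid; the sketch only shows that such a chain reproduces the defining ratio.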
5. SAS-Output of the Practical Example

The numerical values of the listing are not preserved in this transcription; only its layout is reproduced.

    Measurement of interobserver agreement

    CATMOD PROCEDURE

    Response: OB1*OB2            Response Levels (R)=
    Weight variable: FREQ        Populations     (S)=
    Data Set: FDS                Total Frequency (N)=
                                 Observations  (Obs)=

    ANALYSIS OF VARIANCE TABLE

    Source         DF    Chi-Square    Prob
    MODEL|MEAN     0*
    RESIDUAL       0

    NOTE: Effects marked with * contained 1 or more
          singularities (i.e., redundant parameters).

    ANALYSIS OF WEIGHTED-LEAST-SQUARES ESTIMATES

    Effect    Parameter    Estimate    Standard Error    Chi-Square    Prob
    MODEL

    COVARIANCE MATRIX OF THE PARAMETER ESTIMATES
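The standard error in the 'Analysis of weighted-least-squares estimates' table is an asymptotic quantity obtained from the first derivative matrix of the response function and the multinomial covariance matrix of the cell proportions. As a cross-check outside SAS, the Python sketch below computes kappa and the corresponding delta-method variance directly from the analytic gradient of kappa with respect to the cell probabilities. It is an illustration under the multinomial sampling model, not PROC CATMOD output; the function name `kappa_with_se` and the example counts are hypothetical.

```python
import math


def kappa_with_se(counts):
    """Return (kappa, asymptotic standard error, asymptotic variance)
    for a square table of counts, using the multinomial delta method."""
    K = len(counts)
    n = sum(sum(row) for row in counts)
    p = [[c / n for c in row] for row in counts]
    row_m = [sum(p[i]) for i in range(K)]                        # pi_i.
    col_m = [sum(p[i][j] for i in range(K)) for j in range(K)]   # pi_.j
    po = sum(p[i][i] for i in range(K))                          # observed agreement
    pe = sum(row_m[k] * col_m[k] for k in range(K))              # chance agreement
    kappa = (po - pe) / (1.0 - pe)

    # Gradient of kappa with respect to pi_ij:
    #   d(po)/d(pi_ij) = 1 if i == j else 0
    #   d(pe)/d(pi_ij) = pi_.i + pi_j.
    s1 = 0.0  # sum of pi_ij * g_ij^2
    s2 = 0.0  # sum of pi_ij * g_ij
    for i in range(K):
        for j in range(K):
            d_po = 1.0 if i == j else 0.0
            d_pe = col_m[i] + row_m[j]
            g = (d_po * (1.0 - pe) - (1.0 - po) * d_pe) / (1.0 - pe) ** 2
            s1 += p[i][j] * g * g
            s2 += p[i][j] * g

    # g' (diag(pi) - pi pi') g / n
    var = (s1 - s2 * s2) / n
    return kappa, math.sqrt(var), var


# Hypothetical 2 x 2 counts, chosen so the arithmetic is easy to follow
kappa, se, var = kappa_with_se([[40, 10], [10, 40]])
print(kappa, se, var)
```

For the hypothetical 2 x 2 table, the observed agreement is 0.8 and the chance agreement is 0.5, so kappa is 0.6 with an asymptotic variance of 0.0064. In principle the same quantities should appear as the estimate and as the single entry of the covariance matrix when the table is analysed with the dummy-model device.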
6. Discussion

In different fields of statistical application, such as the social sciences, psychology, biology, and epidemiology, a huge number of specific measures of association have been proposed. Producers of statistical software packages like SAS fight a losing battle in trying to extend their systems to cover all measures of association proposed in specific applications, because the variety of ways to describe the relationship between variables with regard to some specific feature of the association seems to be unlimited. Each year some new measures of association are added to this multitude, and there is no end to this development in sight. Whereas, in general, the calculation of the estimator of the measure of association constitutes no problem, the asymptotic variance of the estimator is not easy to obtain computationally.

In this paper we have shown how to use PROC CATMOD of the SAS/STAT* software to solve the computational problems. The advantage of this new approach lies in a substantial reduction of the computational effort for the user. The cumbersome calculation of the asymptotic variance is completely undertaken by the program. The only restriction of our method results from the distributional assumption implicitly employed when using the GSK-methodology. Therefore, for example, data of contingency tables arising from the hypergeometric sampling model (i.e. all marginal distributions are fixed prior to sampling) cannot be analysed in this framework. But for all situations of the multinomial sampling model our approach provides a flexible and convenient way of estimating measures of association and their asymptotic variances.

References

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psych. Meas. 20.
Forthofer, R.N. and Koch, G.G. (1973). An analysis for compounded functions of categorical data. Biometrics 29.
Gefeller, O. and Woltering, F. (1991). A general method of estimating measures of association and their asymptotic variances under the multinomial model using standard SAS software. Computat. Statist. Data Analysis (submitted).
Grizzle, J.E., Starmer, C.F. and Koch, G.G. (1969). Analysis of categorical data by linear models. Biometrics 25.
Koch, G.G., Landis, J.R., Freeman, J.L., Freeman, D.H. and Lehnen, R.G. (1977). A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics 33.
Krahl-Urban, B., Papke, H.E., Peters, K. and Schimansky, C. (1988). Forest decline. Cause-effect research in the United States of North America and Federal Republic of Germany, Jülich.
Landis, J.R. and Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics 33.
Semenya, K.A., Koch, G.G., Stokes, M.E. and Forthofer, R.N. (1983). Linear models methods for some rank functions analyses of ordinal categorical data. Commun. Statistics - Theory Meth. 12.
Williams, O.D. and Grizzle, J.E. (1972). Analysis of contingency tables having ordered response categories. JASA 67.

* SAS/STAT is a registered trademark of SAS Institute Inc., Cary, NC, USA.
More informationSimultaneous Confidence Intervals and Multiple Contrast Tests
Simultaneous Confidence Intervals and Multiple Contrast Tests Edgar Brunner Abteilung Medizinische Statistik Universität Göttingen 1 Contents Parametric Methods Motivating Example SCI Method Analysis of
More informationSpecial Topics. Handout #4. Diagnostics. Residual Analysis. Nonlinearity
Special Topics Diagnostics Residual Analysis As with linear regression, an analysis of residuals is necessary to assess the model adequacy. The same techniques may be employed such as plotting residuals
More informationLogistic Regression. Continued Psy 524 Ainsworth
Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression
More informationModel Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)
Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV
More informationSTAT 7030: Categorical Data Analysis
STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012
More informationCategorical Data Analysis Chapter 3
Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationON INFERENCE FROM GENERAL CATEGORICAL DATA WITH MISCLASSIFICATION ERRORS BASED ON DOUBLE SAMPLING SCHEMES. Yosef Hochberg
~.e ON INFERENCE FROM GENERAL CATEGORICAL DATA WITH MISCLASSIFICATION ERRORS BASED ON DOUBLE SAMPLING SCHEMES by Yosef Hochberg Department of Bios~atistics University of North Carolina at Chapel Hill Institute
More informationCDA Chapter 3 part II
CDA Chapter 3 part II Two-way tables with ordered classfications Let u 1 u 2... u I denote scores for the row variable X, and let ν 1 ν 2... ν J denote column Y scores. Consider the hypothesis H 0 : X
More informationThe Function Selection Procedure
ABSTRACT Paper 2390-2018 The Function Selection Procedure Bruce Lund, Magnify Analytic Solutions, a Division of Marketing Associates, LLC The function selection procedure (FSP) finds a very good transformation
More informationABSTRACT KEYWORDS 1. INTRODUCTION
THE SAMPLE SIZE NEEDED FOR THE CALCULATION OF A GLM TARIFF BY HANS SCHMITTER ABSTRACT A simple upper bound for the variance of the frequency estimates in a multivariate tariff using class criteria is deduced.
More informationLOOKING FOR RELATIONSHIPS
LOOKING FOR RELATIONSHIPS One of most common types of investigation we do is to look for relationships between variables. Variables may be nominal (categorical), for example looking at the effect of an
More informationRANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar
Paper S02-2007 RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar Eli Lilly & Company, Indianapolis, IN ABSTRACT
More informationCohen s s Kappa and Log-linear Models
Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance
More informationLab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )
Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was
More informationNegative Multinomial Model and Cancer. Incidence
Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,
More informationMeasuring relationships among multiple responses
Measuring relationships among multiple responses Linear association (correlation, relatedness, shared information) between pair-wise responses is an important property used in almost all multivariate analyses.
More informationUSE OF THE SAS VARCOMP PROCEDURE TO ESTIMATE ANALYTICAL REPEATABILITY. Anna Caroli Istituto di Zootecnica Veterinaria - Milano - Italy
INTRODUCTION USE OF THE SAS VARCOMP PROCEDURE TO ESTIMATE ANALYTICAL REPEATABILITY Anna Caroli Istituto di Zootecnica Veterinaria - Milano - Italy Researchers often have to assess if an analytical method
More information6 Single Sample Methods for a Location Parameter
6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually
More informationGood Confidence Intervals for Categorical Data Analyses. Alan Agresti
Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline
More informationCOLLABORATION OF STATISTICAL METHODS IN SELECTING THE CORRECT MULTIPLE LINEAR REGRESSIONS
American Journal of Biostatistics 4 (2): 29-33, 2014 ISSN: 1948-9889 2014 A.H. Al-Marshadi, This open access article is distributed under a Creative Commons Attribution (CC-BY) 3.0 license doi:10.3844/ajbssp.2014.29.33
More informationSelection and Transformation of Continuous Predictors for Logistic Regression
Paper AA-09-2014 Selection and Transformation of Continuous Predictors for Logistic Regression ABSTRACT Bruce Lund, Magnify Analytic Solutions A Division of Marketing Associates, Detroit, MI This paper
More information