Correspondence Analysis
- Katherine Nash
- 5 years ago
Transcription
Correspondence Analysis

Q: When independence in a 2-way contingency table is rejected, how do we know where the dependence is coming from?
- The interaction terms in a GLM contain the dependence information; however, interpreting interactions can be difficult.
- Correspondence analysis: a visual residual analysis for contingency tables.

Singular value decomposition
$R$: an $r \times c$ matrix. W.l.o.g., assume $r \ge c$ and $\mathrm{rank}(R) = c$; then $R = U D V^T$, where
- $U$: an $r \times c$ column-orthonormal matrix, i.e., $U^T U = I_{c \times c}$; its columns are called left singular vectors
- $V$: a $c \times c$ column-orthonormal matrix, i.e., $V^T V = I_{c \times c}$; its columns are called right singular vectors
- $D = \mathrm{diag}(d_1, \ldots, d_c)$, where $d_1 \ge d_2 \ge \cdots \ge d_c \ge 0$ are called singular values

Some properties
- Columns of $U_{r \times c}$ are eigenvectors of $(R R^T)_{r \times r}$
- Columns of $V_{c \times c}$ are eigenvectors of $(R^T R)_{c \times c}$
- $\{d_1^2, \ldots, d_c^2\}$ are the eigenvalues of $R^T R$ and the nonzero eigenvalues of $R R^T$

Procedure of correspondence analysis on Pearson residuals
a) Fit a GLM corresponding to independence on the contingency table and compute its Pearson residuals, the $r_P$'s (Q: what information is contained in the $r_P$'s?)
b) Write the $r_P$'s in matrix form $[R_{ij}] = R_{r \times c}$, laid out as in the contingency table
c) Perform the singular value decomposition on $R$: $R = U D V^T$, i.e., $R_{ij} = \sum_{k=1}^{c} U_{ik} d_k V_{jk}$
d) It is not uncommon for the first few singular values of $R$ to be much larger than the rest. Suppose that the first 2 dominate. Then
$R_{ij} \approx U_{i1} d_1 V_{j1} + U_{i2} d_2 V_{j2} = (U_{i1}\sqrt{d_1})(V_{j1}\sqrt{d_1}) + (U_{i2}\sqrt{d_2})(V_{j2}\sqrt{d_2}) \equiv U_{i1} V_{j1} + U_{i2} V_{j2}$,
where in the last expression $U_{ik}$ and $V_{jk}$ denote the rescaled values $U_{ik}\sqrt{d_k}$ and $V_{jk}\sqrt{d_k}$.
e) The 2-dimensional correspondence plot displays the rescaled $U_{i2}$ against $U_{i1}$ and $V_{j2}$ against $V_{j1}$ on the same graph. (Note: because the distance between points will be of interest, it is important that the plot is scaled so that the visual distance is proportionately correct.)

Some notes:
- $R \approx U_{(1)} V_{(1)}^T + U_{(2)} V_{(2)}^T$, where $U_{(k)} = (U_{1k}, \ldots, U_{ik}, \ldots, U_{rk})^T$ and $V_{(k)} = (V_{1k}, \ldots, V_{jk}, \ldots, V_{ck})^T$, so the $(i, j)$ entry of $U_{(k)} V_{(k)}^T$ is $U_{ik} V_{jk}$
- Q: What does a large positive $R_{ij}$ mean? A large negative $R_{ij}$?
- $\sum_k d_k^2$ = Pearson's $X^2$ (because $\sum_{ij} r_P^2 = \mathrm{trace}(R^T R) = \sum_k d_k^2$)

Q: What should we look for in a correspondence plot?
- Large values in $U_{(k)}$ (and $V_{(k)}$): in the contingency table, the profiles of the rows (or the columns) corresponding to the large values are different.
  E.g.: BLOND hair: the distribution of eye colors within this group is not typical.
  E.g.: BROWN hair, by contrast: the distribution of eye colors within this group is close to the marginal distribution of the columns.
- Row and column levels appear close together and far from the origin: a large positive $R_{ij}$ would be associated with the combination.
  E.g.: BLOND hair and blue eyes: strong association.
- Row and column levels are situated diametrically apart on either side of the origin: a large negative $R_{ij}$ would be associated with the combination.
  E.g.: BLOND hair and brown eyes: relatively fewer people.
- Points of two row (or two column) levels are close together: the two rows/columns have a similar pattern of association, so we might consider combining the two categories.
  E.g.: hazel eyes and green eyes: similar hair color distribution.

Other methods: corresp in the MASS package of R (Venables and Ripley, 2002); Blasius and Greenacre (1998)

Reading: Faraway, 4.2
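Steps a) to d) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the slides do not print the hair/eye table, so the counts below are assumed to be the widely used Snee (1974) data (R's HairEyeColor summed over sex); the function names are hypothetical; and the leading singular triple is found by power iteration on $R^T R$ rather than by a full SVD routine such as the one behind corresp.

```python
from math import sqrt

# ASSUMPTION: hair (rows) by eye (columns) counts from Snee (1974),
# i.e. R's HairEyeColor summed over sex; not printed in the slides.
hair = ["BLACK", "BROWN", "RED", "BLOND"]
eye = ["brown", "blue", "hazel", "green"]
counts = [
    [68, 20, 15, 5],
    [119, 84, 54, 29],
    [26, 17, 14, 14],
    [7, 94, 10, 16],
]

def pearson_residuals(table):
    """Pearson residuals (y - mu) / sqrt(mu) under the independence fit."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    resid = []
    for i, row in enumerate(table):
        mus = [row_tot[i] * col_tot[j] / n for j in range(len(row))]
        resid.append([(y - mu) / sqrt(mu) for y, mu in zip(row, mus)])
    return resid

def leading_singular_triple(R, iters=200):
    """Power iteration on R^T R; returns (d_1, U_(1), V_(1))."""
    r, c = len(R), len(R[0])
    v = [float(j + 1) for j in range(c)]  # arbitrary starting vector
    for _ in range(iters):
        u = [sum(R[i][j] * v[j] for j in range(c)) for i in range(r)]  # R v
        w = [sum(R[i][j] * u[i] for i in range(r)) for j in range(c)]  # R^T R v
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(R[i][j] * v[j] for j in range(c)) for i in range(r)]
    d1 = sqrt(sum(x * x for x in u))
    return d1, [x / d1 for x in u], v

R = pearson_residuals(counts)
x2 = sum(x * x for row in R for x in row)  # Pearson's X^2 = sum of squared residuals
d1, u1, v1 = leading_singular_triple(R)

print(f"X^2 = {x2:.2f}; share carried by the first dimension = {d1 ** 2 / x2:.3f}")
# Rescaled first-dimension coordinates U_i1 * sqrt(d_1) and V_j1 * sqrt(d_1).
for name, val in zip(hair, u1):
    print(f"{name:6s} {val * sqrt(d1): .3f}")
for name, val in zip(eye, v1):
    print(f"{name:6s} {val * sqrt(d1): .3f}")
```

The overall sign of a singular vector pair is arbitrary, so only relative positions matter: levels on the same side of the origin (such as BLOND and blue here) correspond to positive residuals, and levels on opposite sides to negative ones.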
Matched Pairs

Data: observe one categorical measure on two matched objects
- E.g.: left and right eye performance of a person
- In contrast, in the typical 2-way contingency table, we observe two (different) categorical measures on one object

The cell probabilities form an $I \times I$ table $[\pi_{ij}]$ with rows indexed by $X_1$ and columns by $X_2$. The corresponding table of counts (shown for $I = 3$):

          X2=1   X2=2   X2=3   total
  X1=1    y_11   y_12   y_13   y_1+
  X1=2    y_21   y_22   y_23   y_2+
  X1=3    y_31   y_32   y_33   y_3+
  total   y_+1   y_+2   y_+3   y_++

Q: What questions might we be interested in for matched pair data?
- $X_1$ and $X_2$ are independent, i.e., $\pi_{ij} = \pi_{i+}\pi_{+j}$ for all $i$ and $j$?
- $[\pi_{ij}]_{I \times I}$ is a symmetric matrix, i.e., $\pi_{ij} = \pi_{ji}$?
- Row and column marginals are homogeneous, i.e., $\pi_{i+} = \pi_{+i}$?
  Symmetry implies marginal homogeneity (the reverse statement is not necessarily true).
- When row and column marginal totals are quite different, we might be interested in whether $\pi_{ij} = \pi_{i+}\pi_{+j}\gamma_{ij}$, where $\gamma_{ij} = \gamma_{ji}$? This hypothesis is called quasi-symmetry.
  Marginal homogeneity together with quasi-symmetry implies symmetry.
- Whether $\pi_{ij} = \pi_{i+}\pi_{+j}$ for $i \ne j$? This is called quasi-independence.

Tests for these hypotheses based on GLM, e.g., with $I = 3$ and $Y = (y_{11}, y_{21}, y_{31}, y_{12}, y_{22}, y_{32}, y_{13}, y_{23}, y_{33})^T$:

Test for the symmetry hypothesis
- Generate a vector with $I^2$ components for an $I(I+1)/2$-level nominal factor with the structure symfactor $= (l_1, l_2, l_3, l_2, l_4, l_5, l_3, l_5, l_6)^T$
- Y ~ symfactor ($S_{sym}$)
- Deviance-based/Pearson $X^2$ goodness-of-fit test for $S_{sym}$

Test for the quasi-symmetry hypothesis
- $\log(\pi_{ij}) = \log(\pi_{i+}\pi_{+j}\gamma_{ij}) = \log(\pi_{i+}) + \log(\pi_{+j}) + \log(\gamma_{ij})$
- Y ~ X1 + X2 + symfactor ($S_{qsym}$)
- Deviance-based/Pearson $X^2$ goodness-of-fit test for $S_{qsym}$

Test for the marginal homogeneity hypothesis
- Deviance-based test for $H_0: S_{sym}$ vs. $H_1: S_{qsym} \setminus S_{sym}$
- The test is only appropriate when $S_{qsym}$ already holds

Test for the quasi-independence hypothesis
- Omit the diagonal data, i.e., $Y = (y_{21}, y_{31}, y_{12}, y_{32}, y_{13}, y_{23})^T$
- Y ~ X1 + X2 ($S_{qindep}$)
- Deviance-based/Pearson $X^2$ goodness-of-fit test for $S_{qindep}$

Reading: Faraway, 4.3
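The symmetry test above can be illustrated without a GLM fitter, because the model Y ~ symfactor has a closed-form fit: each cell's fitted mean is the average of the counts sharing its symfactor level, $(y_{ij} + y_{ji})/2$. The sketch below builds the symfactor labels in the same column-by-column order as $Y$ and computes the goodness-of-fit statistics. The function names and the 3 x 3 counts are hypothetical, made up purely for illustration.

```python
from math import log

def sym_factor(I):
    """Level labels for Y stacked column by column; cells (i, j) and (j, i) share a level."""
    levels = {}
    labels = []
    for j in range(I):            # columns first: y11, y21, y31, y12, ...
        for i in range(I):
            key = (min(i, j), max(i, j))
            levels.setdefault(key, len(levels) + 1)
            labels.append(levels[key])
    return labels

def symmetry_gof(y):
    """Pearson X^2, deviance and residual df for the symmetry model."""
    I = len(y)
    x2 = dev = 0.0
    for i in range(I):
        for j in range(I):
            mu = (y[i][j] + y[j][i]) / 2   # fitted mean within a symfactor level
            if mu == 0:
                continue                   # both cells empty: no contribution
            x2 += (y[i][j] - mu) ** 2 / mu
            if y[i][j] > 0:
                dev += 2 * y[i][j] * log(y[i][j] / mu)
    df = I * I - I * (I + 1) // 2          # number of cells minus number of levels
    return x2, dev, df

# Hypothetical matched-pairs counts (illustration only, not from the slides).
y = [[20, 10, 5],
     [3, 30, 15],
     [2, 9, 40]]

print(sym_factor(3))
x2, dev, df = symmetry_gof(y)
print(f"X^2 = {x2:.3f}, deviance = {dev:.3f}, df = {df}")
```

Both statistics are referred to a chi-square distribution with $I(I-1)/2$ degrees of freedom; the quasi-symmetry and quasi-independence fits have no such closed form and would need an actual Poisson GLM fit.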
Three-Way Contingency Table

The $\pi$'s and $y$'s are defined in the same manner as in the 2-way table, with $X_1$ ($1 \le i \le I$), $X_2$ ($1 \le j \le J$) and $X_3$ ($1 \le k \le K$).

Poisson GLM approach to investigate how $X_1$, $X_2$, $X_3$ interact

Mutual independence ($X_1$, $X_2$, $X_3$ are independent): $\pi_{ijk} = \pi_{i++}\pi_{+j+}\pi_{++k}$
- $\log(\pi_{ijk}) = \log(\pi_{i++}\pi_{+j+}\pi_{++k}) = \log(\pi_{i++}) + \log(\pi_{+j+}) + \log(\pi_{++k})$
- Y ~ X1 + X2 + X3 ($S_1$)
- The estimates of the parameters in this model correspond only to the marginal totals $y_{i++}$, $y_{+j+}$, and $y_{++k}$
- The coding we use will determine exactly how the parameters relate to the margin totals, e.g., let $\beta$ be a main effect of $X_1$ that codes the $i_1$ and $i_2$ categories as 0 and 1. Then
  $e^{\hat\beta}/(1 + e^{\hat\beta}) = \hat\pi_{i_2++}/(\hat\pi_{i_1++} + \hat\pi_{i_2++}) = y_{i_2++}/(y_{i_1++} + y_{i_2++})$
- An insignificant factor, say $X_1$, means $\pi_{1++} = \pi_{2++} = \cdots = \pi_{I++}$

Joint independence ($\{X_1, X_2\}$ and $X_3$ are independent): $\pi_{ijk} = \pi_{ij+}\pi_{++k}$, i.e., $\pi_{ij|k} = \pi_{ij+}$
- $\log(\pi_{ijk}) = \log(\pi_{ij+}\pi_{++k}) = \log(\pi_{ij+}) + \log(\pi_{++k})$
- Y ~ X1 + X2 + X1:X2 + X3 ($S_2 \supset S_1$)

Conditional independence ($X_1$, $X_2$ are independent given $X_3$): $\pi_{ij|k} = \pi_{i|k}\pi_{j|k}$, i.e., $\pi_{ijk} = \pi_{i+k}\pi_{+jk}/\pi_{++k}$
- $\log(\pi_{ijk}) = \log(\pi_{i+k}\pi_{+jk}/\pi_{++k}) = \log(\pi_{i+k}) + \log(\pi_{+jk}) - \log(\pi_{++k})$
- Y ~ X1 + X3 + X1:X3 + X2 + X2:X3 ($S_3$)
- Note that $S_3$ does not contain $S_2$, but the condition that $\{X_1, X_3\}$ and $X_2$ are independent implies that $X_1$ and $X_2$ are independent given $X_3$
- Q: Can conditional independence imply independence between $X_1$ and $X_2$, i.e., $\pi_{ij+} = \pi_{i++}\pi_{+j+}$? (Hint: singular value decomposition)

Uniform association
- Consider a model with all two-factor interactions: Y ~ X1 + X2 + X3 + X1:X2 + X1:X3 + X2:X3 ($S_4 \supset S_3$)
- $S_4$ is not saturated, so some degrees of freedom are left for a goodness-of-fit test
- $S_4$ has no simple interpretation in terms of independence
- $S_4$ asserts that for every level of one variable, say $X_3$, we have the same association between $X_1$ and $X_2$
- For each level of $X_3$, the reduced models of $S_4$ have different coefficients for the main effects of X1 and X2, but have the same coefficient for the interaction X1:X2
- E.g., $I = J = 2$: same fitted odds ratio between X1 and X2 for each category of X3. Note that
  fitted odds ratio $= \hat y_{11k}\hat y_{22k}/(\hat y_{12k}\hat y_{21k}) = \hat\pi_{11k}\hat\pi_{22k}/(\hat\pi_{12k}\hat\pi_{21k}) = e^{\beta_{12k}}$,
  where $\beta_{12k}$ is the coefficient of the X1:X2 term (under the 0-1 coding) in the reduced model for $X_3 = k$

Q: What does uniform association mean? How to interpret the association? How does it connect with interaction terms?

A saturated model corresponds to a 3-way table with different association between, say, X1 and X2 across the $K$ levels of X3, whereas Y ~ 1 corresponds to a 3-way table with constant $\pi_{ijk}$.

Q: How to examine whether the X1, X2, X3 in a 3-way table are mutually independent, jointly independent, conditionally independent, or uniformly associated?
Ans: Perform deviance-based/Pearson's $X^2$ goodness-of-fit tests for $S_1$, $S_2$, $S_3$, $S_4$, respectively.
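As a rough illustration of these four goodness-of-fit tests, the sketch below computes fitted values, deviances and residual degrees of freedom for $S_1$ to $S_4$ in plain Python. Some assumptions to flag: $S_1$, $S_2$ and $S_3$ use their closed-form fits from the margins; $S_4$ has no closed form, so it is fitted here by iterative proportional fitting, which reaches the same fitted values as the Poisson GLM but is not the method the slides use; the function name and the 2 x 2 x 2 counts are hypothetical.

```python
from math import log

def fit_models(y, ipf_iters=100):
    """Return {name: (fitted table, deviance, df)} for S1..S4; y[i][j][k] must be positive."""
    I, J, K = len(y), len(y[0]), len(y[0][0])
    n = sum(y[i][j][k] for i in range(I) for j in range(J) for k in range(K))
    # One-way and two-way marginal totals.
    y_i = [sum(sum(y[i][j]) for j in range(J)) for i in range(I)]
    y_j = [sum(sum(y[i][j]) for i in range(I)) for j in range(J)]
    y_k = [sum(y[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]
    y_ij = [[sum(y[i][j]) for j in range(J)] for i in range(I)]
    y_ik = [[sum(y[i][j][k] for j in range(J)) for k in range(K)] for i in range(I)]
    y_jk = [[sum(y[i][j][k] for i in range(I)) for k in range(K)] for j in range(J)]

    def table(f):
        return [[[f(i, j, k) for k in range(K)] for j in range(J)] for i in range(I)]

    def deviance(mu):
        # 2 * sum y log(y / mu); the (y - mu) terms cancel because totals match.
        return 2 * sum(y[i][j][k] * log(y[i][j][k] / mu[i][j][k])
                       for i in range(I) for j in range(J) for k in range(K))

    s1 = table(lambda i, j, k: y_i[i] * y_j[j] * y_k[k] / n ** 2)   # mutual independence
    s2 = table(lambda i, j, k: y_ij[i][j] * y_k[k] / n)             # joint independence
    s3 = table(lambda i, j, k: y_ik[i][k] * y_jk[j][k] / y_k[k])    # conditional independence

    # Uniform association: rescale to each two-way margin in turn (IPF).
    s4 = table(lambda i, j, k: 1.0)
    for _ in range(ipf_iters):
        s4 = table(lambda i, j, k: s4[i][j][k] * y_ij[i][j] / sum(s4[i][j]))
        s4 = table(lambda i, j, k: s4[i][j][k] * y_ik[i][k]
                   / sum(s4[i][jj][k] for jj in range(J)))
        s4 = table(lambda i, j, k: s4[i][j][k] * y_jk[j][k]
                   / sum(s4[ii][j][k] for ii in range(I)))

    df = {"S1": I * J * K - I - J - K + 2,
          "S2": (I * J - 1) * (K - 1),
          "S3": K * (I - 1) * (J - 1),
          "S4": (I - 1) * (J - 1) * (K - 1)}
    fits = {"S1": s1, "S2": s2, "S3": s3, "S4": s4}
    return {name: (mu, deviance(mu), df[name]) for name, mu in fits.items()}

# Hypothetical 2 x 2 x 2 counts y[i][j][k] (illustration only).
y = [[[10, 20], [30, 15]],
     [[25, 5], [12, 18]]]

result = fit_models(y)
for name, (mu, dev, df) in result.items():
    print(f"{name}: deviance = {dev:.3f} on {df} df")

# Under S4 the fitted X1-X2 odds ratio is the same at every level of X3.
mu4 = result["S4"][0]
odds = [mu4[0][0][k] * mu4[1][1][k] / (mu4[0][1][k] * mu4[1][0][k]) for k in range(2)]
print("fitted odds ratios by level of X3:", odds)
```

Each deviance would be compared with a chi-square distribution on the listed degrees of freedom, and differences in deviance between nested models (for example $S_3$ against $S_4$) give the model-comparison tests.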
However, be careful of zero or small $y_{ijk}$:
- There will be some doubt as to the accuracy of the chi-square approximation in the goodness-of-fit test
- The chi-square approximation is better for comparing models than for assessing goodness of fit

Analysis strategy: start with a complex Poisson GLM (such as the saturated model) and see how far the model can be reduced (by using deviance-based tests to compare models).

Binomial (multinomial) GLM approach for 3-way table

When the $y_{ij+}$'s are regarded as fixed, we can treat $Y_{X_3}$ as a response and $X_1$, $X_2$ as covariates.
- Q1: What information is gone? Ans: information about $\pi_{ij+}$
- Q2: What information is still attainable? Ans: information about $\pi_{k|ij}$

$Y_{X_3} = y_{ij1} \sim \mathrm{binomial}(y_{ij+}, \pi_{k=1|ij})$ if $K = 2$
$Y_{X_3} = (y_{ij1}, \ldots, y_{ijK}) \sim \mathrm{multinomial}(y_{ij+}, \pi_{k=1|ij}, \ldots, \pi_{k=K|ij})$ if $K > 2$
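A small sketch may help with Q1 and Q2. For $K = 2$ it turns a 3-way table into the binomial responses $(y_{ij1}, y_{ij+})$ and shows that the observed conditional proportions do not change when one $(i, j)$ stratum is rescaled, which is why conditioning on $y_{ij+}$ discards the information about $\pi_{ij+}$ while keeping that about $\pi_{k|ij}$. The function name and counts are hypothetical.

```python
def binomial_view(y):
    """For an I x J x 2 table, map (i, j) to (successes, trials, proportion), with X3 = 1 as success."""
    out = {}
    for i, plane in enumerate(y):
        for j, cell in enumerate(plane):
            trials = sum(cell)   # y_ij+, regarded as fixed
            out[(i, j)] = (cell[0], trials, cell[0] / trials)
    return out

# Hypothetical 2 x 2 x 2 counts y[i][j][k] (illustration only).
y = [[[10, 20], [30, 15]],
     [[25, 5], [12, 18]]]

# Same table with the (X1 = 1, X2 = 1) stratum ten times larger: the joint
# distribution of (X1, X2) changes, the conditional distribution of X3 does not.
y_scaled = [[[100, 200], [30, 15]],
            [[25, 5], [12, 18]]]

for key, (succ, trials, prop) in binomial_view(y).items():
    scaled_prop = binomial_view(y_scaled)[key][2]
    print(key, f"y_ij1 = {succ}, y_ij+ = {trials}, proportion = {prop:.3f}, "
               f"after rescaling = {scaled_prop:.3f}")
```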
Linear Algebra (Review) Volker Tresp 2017 1 Vectors k is a scalar (a number) c is a column vector. Thus in two dimensions, c = ( c1 c 2 ) (Advanced: More precisely, a vector is defined in a vector space.
More informationMa/CS 6b Class 20: Spectral Graph Theory
Ma/CS 6b Class 20: Spectral Graph Theory By Adam Sheffer Eigenvalues and Eigenvectors A an n n matrix of real numbers. The eigenvalues of A are the numbers λ such that Ax = λx for some nonzero vector x
More informationBasic Concepts in Matrix Algebra
Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1
More informationStatistics - Lecture 04
Statistics - Lecture 04 Nicodème Paul Faculté de médecine, Université de Strasbourg file:///users/home/npaul/enseignement/esbs/2018-2019/cours/04/index.html#40 1/40 Correlation In many situations the objective
More informationStat 315c: Transposable Data Rasch model and friends
Stat 315c: Transposable Data Rasch model and friends Art B. Owen Stanford Statistics Art B. Owen (Stanford Statistics) Rasch and friends 1 / 14 Categorical data analysis Anova has a problem with too much
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationRegression #5: Confidence Intervals and Hypothesis Testing (Part 1)
Regression #5: Confidence Intervals and Hypothesis Testing (Part 1) Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #5 1 / 24 Introduction What is a confidence interval? To fix ideas, suppose
More informationHANDBOOK OF APPLICABLE MATHEMATICS
HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationlinearly indepedent eigenvectors as the multiplicity of the root, but in general there may be no more than one. For further discussion, assume matrice
3. Eigenvalues and Eigenvectors, Spectral Representation 3.. Eigenvalues and Eigenvectors A vector ' is eigenvector of a matrix K, if K' is parallel to ' and ' 6, i.e., K' k' k is the eigenvalue. If is
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.
More informationCorrespondence Analysis
STATGRAPHICS Rev. 7/6/009 Correspondence Analysis The Correspondence Analysis procedure creates a map of the rows and columns in a two-way contingency table for the purpose of providing insights into the
More information