Principal Component Analysis for Mixed Quantitative and Qualitative Data
|
|
- Erin Reynolds
- 5 years ago
- Views:
Transcription
1 Principal Component Analysis for Mixed Quantitative and Qualitative Data Susana Agudelo-Jaramillo Manuela Ochoa-Muñoz Tutor: Francisco Iván Zuluaga-Díaz EAFIT University Medelĺın-Colombia Research Practise April 12th, 2016
2 Mixed Quantitative and Qualitative Data Principal Component Analysis for Mixed Quantitative and Qualitative Data 1/23
3 Correspondence Analysis It is a graphical technique to represent information of a contingency table with two inputs, which contains the count of elements for crossclassification of two categorical variables. These tables are based on two qualitative nominal or ordinal variables where categories of one variable appear in rows and other variable categories are represented in columns [de la Fuente Fernández, 2011]. Correspondence analysis can be useful to identify categories that are similar, which therefore can be combined. Principal Component Analysis for Mixed Quantitative and Qualitative Data 2/23
4 Example The following example illustrated how a quantification matrix works for a sample of 12 people and 4 categorical variables. Figure 1: Categories for the four variables taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 3/23
5 Example Figure 2: List of 12 people and their categories on four variables taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 4/23
6 Example Figure 3: Correspondence analysis of the four variables Principal Component Analysis for Mixed Quantitative and Qualitative Data 5/23
7 Indicator Matrix S ij = 1 if object i belongs to the category of the variable j 0 if object i does not belong to the category of the variable j Figure 4: Indicator matrix G for the data taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 6/23
8 Burt Matrix From the indicator matrix G we can get the G G matrix known as the Burt matrix. Figure 5: Burt Matrix G G for the matrix G taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 7/23
9 Burt Matrix In the diagonal blocks appear matrices containing the marginal frequencies of each of the variables analyzed. Outside the diagonal appear contingency tables of frequencies corresponding to all combinations 2 to 2 of the variables analyzed. Figure 6: Part of the contingency tables for variables Gender and Age Principal Component Analysis for Mixed Quantitative and Qualitative Data 8/23
10 Quantification Matrices Quantification matrices transform qualitative data into components which facilitates the analysis of results. The idea of using quantification matrices is to define correlation coefficients. The quantification matrices are used to measure similarity and dissimilarity between the objects respect to a variable. Principal Component Analysis for Mixed Quantitative and Qualitative Data 9/23
11 Quantification Matrix G j G j The elements of the quantification matrix G j G j are given by: S ii j = 1 if object i and object i belong to the same category 0 if object i and object i belong to different category S ii j it is a measure of similarity between sample objects i and i in terms of a particular variable j. The frequency categories and the number of categories are not taken into account in this measure of similarity [Kiers, 1989]. Principal Component Analysis for Mixed Quantitative and Qualitative Data 10/23
12 Quantification Matrix G j G j Table 1: Quantification matrix GG of hair color variable Hair Color Principal Component Analysis for Mixed Quantitative and Qualitative Data 11/23
13 Example Figure 7: List of 12 people and their categories on four variables taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 12/23
14 Quantification Matrix G j (G j G j) 1 G j In this case Burt matrix inverted is added: Table 2: Burt matrix inverted of hair color variable Hair Color Blond Brown Black Red Blond Brown Black Red Principal Component Analysis for Mixed Quantitative and Qualitative Data 13/23
15 Quantification Matrix G j (G j G j) 1 G j The elements of the quantification matrix G j (G j G j) 1 G j are given by: S ii j = f 1 g if object i and object i belong to the same category 0 if object i and object i belong to different category where f 1 g is the g th diagonal element of (G j G j) 1 [Kiers, 1989]. Principal Component Analysis for Mixed Quantitative and Qualitative Data 14/23
16 Quantification Matrix G j (G j G j) 1 G j Table 3: Quantification matrix G(G G) 1 G of hair color variable Hair Color Principal Component Analysis for Mixed Quantitative and Qualitative Data 15/23
17 Example Figure 8: List of 12 people and their categories on four variables taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 16/23
18 Quantification Matrix JG j (G j G j) 1 G j J Here the J matrix is added: J=I n 11 n where I n is the identity matrix, 1 is an ones vector and n is the sample size. Principal Component Analysis for Mixed Quantitative and Qualitative Data 17/23
19 Quantification Matrix JG j (G j G j) 1 G j J Table 4: J matrix J Matrix Principal Component Analysis for Mixed Quantitative and Qualitative Data 18/23
20 Quantification Matrix JG j (G j G j) 1 G j J This quantification matrix is a normalized version of the χ 2 measure. Where χ 2 = 0 if variables are statistically independent [Kiers, 1989]. The elements of the quantification matrix JG j (G j G j) 1 G j J are given by: S ii j = f 1 g n 1 if object i and object i belong to the same category n 1 if object i and object i belong to different category Principal Component Analysis for Mixed Quantitative and Qualitative Data 19/23
21 Quantification Matrix JG j (G j G j) 1 G j J Table 5: Quantification matrix JG(G G) 1 G J of hair color variable Hair Color Principal Component Analysis for Mixed Quantitative and Qualitative Data 20/23
22 Example Figure 9: List of 12 people and their categories on four variables taken from [Rencher, 1934] Principal Component Analysis for Mixed Quantitative and Qualitative Data 21/23
23 References de la Fuente Fernández, S. (2011). Análisis correspondencias simples y múltiples. Universidad Autónoma de Madrid, pages 1 9. Kiers, H. (1989). Three-way methods for the analysis of qualitative and quantitative two-way data. PhD thesis. Rencher, A. C. (1934). Methods of Multivariate Analysis. Wiley Series in Probability and Statistics. Principal Component Analysis for Mixed Quantitative and Qualitative Data 22/23
24 THANKS FOR YOUR ATTENTION Principal Component Analysis for Mixed Quantitative and Qualitative Data 23/23
Principal Component Analysis for Mixed Quantitative and Qualitative Data
1 Principal Component Analysis for Mixed Quantitative and Qualitative Data Susana Agudelo-Jaramillo sagudel9@eafit.edu.co Manuela Ochoa-Muñoz mochoam@eafit.edu.co Francisco Iván Zuluaga-Díaz fzuluag2@eafit.edu.co
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationChapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.
Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:
More informationExtended Mosaic and Association Plots for Visualizing (Conditional) Independence. Achim Zeileis David Meyer Kurt Hornik
Extended Mosaic and Association Plots for Visualizing (Conditional) Independence Achim Zeileis David Meyer Kurt Hornik Overview The independence problem in -way contingency tables Standard approach: χ
More informationRelate Attributes and Counts
Relate Attributes and Counts This procedure is designed to summarize data that classifies observations according to two categorical factors. The data may consist of either: 1. Two Attribute variables.
More informationCorrespondence Analysis
Correspondence Analysis Q: when independence of a 2-way contingency table is rejected, how to know where the dependence is coming from? The interaction terms in a GLM contain dependence information; however,
More information13.1 Categorical Data and the Multinomial Experiment
Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)
More informationMultiple Correspondence Analysis
Multiple Correspondence Analysis 18 Up to now we have analysed the association between two categorical variables or between two sets of categorical variables where the row variables are different from
More informationMS-E2112 Multivariate Statistical Analysis (5cr) Lecture 5: Bivariate Correspondence Analysis
MS-E2112 Multivariate Statistical (5cr) Lecture 5: Bivariate Contents analysis is a PCA-type method appropriate for analyzing categorical variables. The aim in bivariate correspondence analysis is to
More informationCorrespondence Analysis of Longitudinal Data
Correspondence Analysis of Longitudinal Data Mark de Rooij* LEIDEN UNIVERSITY, LEIDEN, NETHERLANDS Peter van der G. M. Heijden UTRECHT UNIVERSITY, UTRECHT, NETHERLANDS *Corresponding author (rooijm@fsw.leidenuniv.nl)
More informationAnalysis of data in square contingency tables
Analysis of data in square contingency tables Iva Pecáková Let s suppose two dependent samples: the response of the nth subject in the second sample relates to the response of the nth subject in the first
More informationVisualizing Independence Using Extended Association and Mosaic Plots. Achim Zeileis David Meyer Kurt Hornik
Visualizing Independence Using Extended Association and Mosaic Plots Achim Zeileis David Meyer Kurt Hornik Overview The independence problem in 2-way contingency tables Standard approach: χ 2 test Alternative
More informationLecture 28 Chi-Square Analysis
Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not
More informationSTAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).
STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example
More informationInference for Categorical Data. Chi-Square Tests for Goodness of Fit and Independence
Chi-Square Tests for Goodness of Fit and Independence Chi-Square Tests In this course, we use chi-square tests in two different ways The chi-square test for goodness-of-fit is used to determine whether
More information11-2 Multinomial Experiment
Chapter 11 Multinomial Experiments and Contingency Tables 1 Chapter 11 Multinomial Experiments and Contingency Tables 11-11 Overview 11-2 Multinomial Experiments: Goodness-of-fitfit 11-3 Contingency Tables:
More information8/4/2009. Describing Data with Graphs
Describing Data with Graphs 1 A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration. Examples: Hair color, white blood cell count,
More informationSTAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression
STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test
More informationContingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878
Contingency Tables I. Definition & Examples. A) Contingency tables are tables where we are looking at two (or more - but we won t cover three or more way tables, it s way too complicated) factors, each
More informationTopic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!
Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of
More informationStatistics 3858 : Contingency Tables
Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson
More informationTopic 21 Goodness of Fit
Topic 21 Goodness of Fit Contingency Tables 1 / 11 Introduction Two-way Table Smoking Habits The Hypothesis The Test Statistic Degrees of Freedom Outline 2 / 11 Introduction Contingency tables, also known
More informationConsideration on Sensitivity for Multiple Correspondence Analysis
Consideration on Sensitivity for Multiple Correspondence Analysis Masaaki Ida Abstract Correspondence analysis is a popular research technique applicable to data and text mining, because the technique
More informationNumber of cases (objects) Number of variables Number of dimensions. n-vector with categorical observations
PRINCALS Notation The PRINCALS algorithm was first described in Van Rickevorsel and De Leeuw (1979) and De Leeuw and Van Rickevorsel (1980); also see Gifi (1981, 1985). Characteristic features of PRINCALS
More informationANÁLISE DOS DADOS. Daniela Barreiro Claro
ANÁLISE DOS DADOS Daniela Barreiro Claro Outline Data types Graphical Analysis Proimity measures Prof. Daniela Barreiro Claro Types of Data Sets Record Ordered Relational records Video data: sequence of
More informationChapter 26: Comparing Counts (Chi Square)
Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions
More informationIntroduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution
Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis
More informationInferential statistics
Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,
More information1. Introduction to Multivariate Analysis
1. Introduction to Multivariate Analysis Isabel M. Rodrigues 1 / 44 1.1 Overview of multivariate methods and main objectives. WHY MULTIVARIATE ANALYSIS? Multivariate statistical analysis is concerned with
More informationModule 10: Analysis of Categorical Data Statistics (OA3102)
Module 10: Analysis of Categorical Data Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 14.1-14.7 Revision: 3-12 1 Goals for this
More informationSTATISTICS 407 METHODS OF MULTIVARIATE ANALYSIS TOPICS
STATISTICS 407 METHODS OF MULTIVARIATE ANALYSIS TOPICS Principal Component Analysis (PCA): Reduce the, summarize the sources of variation in the data, transform the data into a new data set where the variables
More informationCHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
More information22m:033 Notes: 7.1 Diagonalization of Symmetric Matrices
m:33 Notes: 7. Diagonalization of Symmetric Matrices Dennis Roseman University of Iowa Iowa City, IA http://www.math.uiowa.edu/ roseman May 3, Symmetric matrices Definition. A symmetric matrix is a matrix
More informationA Weighted Score Derived from a Multiple Correspondence
Int Statistical Inst: Proc 58th World Statistical Congress 0 Dublin (Session CPS06) p4 A Weighted Score Derived from a Multiple Correspondence Analysis Solution de Souza Márcio L M Universidade Federal
More informationML Testing (Likelihood Ratio Testing) for non-gaussian models
ML Testing (Likelihood Ratio Testing) for non-gaussian models Surya Tokdar ML test in a slightly different form Model X f (x θ), θ Θ. Hypothesist H 0 : θ Θ 0 Good set: B c (x) = {θ : l x (θ) max θ Θ l
More informationGeneralized logit models for nominal multinomial responses. Local odds ratios
Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π
More informationHYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC
1 HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 7 steps of Hypothesis Testing 1. State the hypotheses 2. Identify level of significant 3. Identify the critical values 4. Calculate test statistics 5. Compare
More informationContingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.
Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,
More informationEvaluation of scoring index with different normalization and distance measure with correspondence analysis
Evaluation of scoring index with different normalization and distance measure with correspondence analysis Anders Nilsson Master Thesis in Statistics 15 ECTS Spring Semester 2010 Supervisors: Professor
More informationCategorical Data Analysis Chapter 3
Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 6 (MWF) Conditional probabilities and associations Suhasini Subba Rao Review of previous lecture
More informationChapter 2 Nonlinear Principal Component Analysis
Chapter 2 Nonlinear Principal Component Analysis Abstract Principal components analysis (PCA) is a commonly used descriptive multivariate method for handling quantitative data and can be extended to deal
More informationChi-square (χ 2 ) Tests
Math 442 - Mathematical Statistics II April 30, 2018 Chi-square (χ 2 ) Tests Common Uses of the χ 2 test. 1. Testing Goodness-of-fit. 2. Testing Equality of Several Proportions. 3. Homogeneity Test. 4.
More informationStatistics I Chapter 3: Bivariate data analysis
Statistics I Chapter 3: Bivariate data analysis Chapter 3: Bivariate data analysis Contents 3.1 Two-way tables Bivariate data Definition of a two-way table Joint absolute/relative frequency distribution
More informationABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY
ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY José A. Díaz-García and Raúl Alberto Pérez-Agamez Comunicación Técnica No I-05-11/08-09-005 (PE/CIMAT) About principal components under singularity José A.
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationResearch Methodology: Tools
MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 05: Contingency Analysis March 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi Bruderer Enzler Contents Slide
More informationseries. Utilize the methods of calculus to solve applied problems that require computational or algebraic techniques..
1 Use computational techniques and algebraic skills essential for success in an academic, personal, or workplace setting. (Computational and Algebraic Skills) MAT 203 MAT 204 MAT 205 MAT 206 Calculus I
More informationParametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami
Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous
More informationCHAPTER 2. Types of Effect size indices: An Overview of the Literature
CHAPTER Types of Effect size indices: An Overview of the Literature There are different types of effect size indices as a result of their different interpretations. Huberty (00) names three different types:
More informationDescribing Contingency tables
Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds
More informationTHE GIFI SYSTEM OF DESCRIPTIVE MULTIVARIATE ANALYSIS
THE GIFI SYSTEM OF DESCRIPTIVE MULTIVARIATE ANALYSIS GEORGE MICHAILIDIS AND JAN DE LEEUW ABSTRACT. The Gifi system of analyzing categorical data through nonlinear varieties of classical multivariate analysis
More informationMultivariate Statistics Fundamentals Part 1: Rotation-based Techniques
Multivariate Statistics Fundamentals Part 1: Rotation-based Techniques A reminded from a univariate statistics courses Population Class of things (What you want to learn about) Sample group representing
More informationStatistics for Managers Using Microsoft Excel
Statistics for Managers Using Microsoft Excel 7 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Statistics for Managers Using Microsoft Excel 7e Copyright 014 Pearson Education, Inc. Chap
More informationCorrespondence analysis. Correspondence analysis: Basic ideas. Example: Hair color, eye color data
Correspondence analysis Correspondence analysis: Michael Friendly Psych 6136 October 9, 17 Correspondence analysis (CA) Analog of PCA for frequency data: account for maximum % of χ 2 in few (2-3) dimensions
More informationChi-square (χ 2 ) Tests
Math 145 - Elementary Statistics April 17, 2007 Common Uses of the χ 2 test. 1. Testing Goodness-of-fit. Chi-square (χ 2 ) Tests 2. Testing Equality of Several Proportions. 3. Homogeneity Test. 4. Testing
More informationData Mining 4. Cluster Analysis
Data Mining 4. Cluster Analysis 4.2 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Data Structures Interval-Valued (Numeric) Variables Binary Variables Categorical Variables Ordinal Variables Variables
More informationIs there a connection between gender, maths grade, hair colour and eye colour? Contents
5 Sample project This Maths Studies project has been graded by a moderator. As you read through it, you will see comments from the moderator in boxes like this: At the end of the sample project is a summary
More information12 Chi-squared (χ 2 ) Tests for Goodness-of-fit and Independence
12 Chi-squared (χ 2 ) Tests for Goodness-of-fit and Independence The chi-squared tests are for H 0 : The frequency distribution of events observed in a sample is with a particular distribution against
More informationMath 201 Statistics for Business & Economics. Definition of Statistics. Two Processes that define Statistics. Dr. C. L. Ebert
Math 201 Statistics for Business & Economics Dr. C. L. Ebert Chapter 1 Introduction Definition of Statistics Statistics - the study of the collection, organization, presentation, and characterization of
More informationMinimum Phi-Divergence Estimators and Phi-Divergence Test Statistics in Contingency Tables with Symmetry Structure: An Overview
Symmetry 010,, 1108-110; doi:10.3390/sym01108 OPEN ACCESS symmetry ISSN 073-8994 www.mdpi.com/journal/symmetry Review Minimum Phi-Divergence Estimators and Phi-Divergence Test Statistics in Contingency
More informationSTAC51: Categorical data Analysis
STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon
More informationSimilarity and Dissimilarity
1//015 Similarity and Dissimilarity COMP 465 Data Mining Similarity of Data Data Preprocessing Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed.
More informationData Analysis as a Decision Making Process
Data Analysis as a Decision Making Process I. Levels of Measurement A. NOIR - Nominal Categories with names - Ordinal Categories with names and a logical order - Intervals Numerical Scale with logically
More informationEcon 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms. 1 Diagonalization and Change of Basis
Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms De La Fuente notes that, if an n n matrix has n distinct eigenvalues, it can be diagonalized. In this supplement, we will provide
More informationS1600 #2. Data Presentation #1. January 14, 2016
S1600 #2 Data Presentation #1 January 14, 2016 Outline 1 Data Presentation #1 Statistics and Data Variable Types Summarizing Categorical Data (WMU) S1600 #2 S1600, Lecture 2 2 / 14 Statistics and Data
More informationMARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES
REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of
More informationNOMINAL VARIABLE CLUSTERING AND ITS EVALUATION
NOMINAL VARIABLE CLUSTERING AND ITS EVALUATION Hana Řezanková Abstract The paper evaluates clustering of nominal variables using different similarity measures. The created clusters can serve for dimensionality
More informationEasy Categorization of Attributes in Decision Tables Based on Basic Binary Discernibility Matrix
Easy Categorization of Attributes in Decision Tables Based on Basic Binary Discernibility Matrix Manuel S. Lazo-Cortés 1, José Francisco Martínez-Trinidad 1, Jesús Ariel Carrasco-Ochoa 1, and Guillermo
More informationTextbook Examples of. SPSS Procedure
Textbook s of IBM SPSS Procedures Each SPSS procedure listed below has its own section in the textbook. These sections include a purpose statement that describes the statistical test, identification of
More informationLecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests
Lecture 9 Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Univariate categorical data Univariate categorical data are best summarized in a one way frequency table.
More informationINFORMATION THEORY AND STATISTICS
INFORMATION THEORY AND STATISTICS Solomon Kullback DOVER PUBLICATIONS, INC. Mineola, New York Contents 1 DEFINITION OF INFORMATION 1 Introduction 1 2 Definition 3 3 Divergence 6 4 Examples 7 5 Problems...''.
More informationWorked Examples for Nominal Intercoder Reliability. by Deen G. Freelon October 30,
Worked Examples for Nominal Intercoder Reliability by Deen G. Freelon (deen@dfreelon.org) October 30, 2009 http://www.dfreelon.com/utils/recalfront/ This document is an excerpt from a paper currently under
More informationHypothesis Testing: Chi-Square Test 1
Hypothesis Testing: Chi-Square Test 1 November 9, 2017 1 HMS, 2017, v1.0 Chapter References Diez: Chapter 6.3 Navidi, Chapter 6.10 Chapter References 2 Chi-square Distributions Let X 1, X 2,... X n be
More informationSection 2.1 ~ Data Types and Levels of Measurement. Introduction to Probability and Statistics Spring 2017
Section 2.1 ~ Data Types and Levels of Measurement Introduction to Probability and Statistics Spring 2017 Objective To be able to classify data as qualitative or quantitative, to identify quantitative
More informationFrequency Distribution Cross-Tabulation
Frequency Distribution Cross-Tabulation 1) Overview 2) Frequency Distribution 3) Statistics Associated with Frequency Distribution i. Measures of Location ii. Measures of Variability iii. Measures of Shape
More information7.6 The Inverse of a Square Matrix
7.6 The Inverse of a Square Matrix Copyright Cengage Learning. All rights reserved. What You Should Learn Verify that two matrices are inverses of each other. Use Gauss-Jordan elimination to find inverses
More informationAverage weight of Eisenhower dollar: 23 grams
Average weight of Eisenhower dollar: 23 grams Average cost of dinner in Decatur: 23 dollars Would it be more surprising to see A dinner that costs more than 27 dollars, or An Eisenhower dollar that weighs
More informationMultiplying matrices by diagonal matrices is faster than usual matrix multiplication.
7-6 Multiplying matrices by diagonal matrices is faster than usual matrix multiplication. The following equations generalize to matrices of any size. Multiplying a matrix from the left by a diagonal matrix
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More information1 A factor can be considered to be an underlying latent variable: (a) on which people differ. (b) that is explained by unknown variables
1 A factor can be considered to be an underlying latent variable: (a) on which people differ (b) that is explained by unknown variables (c) that cannot be defined (d) that is influenced by observed variables
More informationStatistics - Lecture 04
Statistics - Lecture 04 Nicodème Paul Faculté de médecine, Université de Strasbourg file:///users/home/npaul/enseignement/esbs/2018-2019/cours/04/index.html#40 1/40 Correlation In many situations the objective
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationVALUES FOR THE CUMULATIVE DISTRIBUTION FUNCTION OF THE STANDARD MULTIVARIATE NORMAL DISTRIBUTION. Carol Lindee
VALUES FOR THE CUMULATIVE DISTRIBUTION FUNCTION OF THE STANDARD MULTIVARIATE NORMAL DISTRIBUTION Carol Lindee LindeeEmail@netscape.net (708) 479-3764 Nick Thomopoulos Illinois Institute of Technology Stuart
More informationThe goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.
The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining
More informationMATH 323 Linear Algebra Lecture 6: Matrix algebra (continued). Determinants.
MATH 323 Linear Algebra Lecture 6: Matrix algebra (continued). Determinants. Elementary matrices Theorem 1 Any elementary row operation σ on matrices with n rows can be simulated as left multiplication
More informationDECISION MAKING SUPPORT AND EXPERT SYSTEMS
325 ITHEA DECISION MAKING SUPPORT AND EXPERT SYSTEMS UTILITY FUNCTION DESIGN ON THE BASE OF THE PAIRED COMPARISON MATRIX Stanislav Mikoni Abstract: In the multi-attribute utility theory the utility functions
More informationLecture 21 Comparing Counts - Chi-square test
Lecture 21 Comparing Counts - Chi-square test Thais Paiva STA 111 - Summer 2013 Term II August 5, 2013 1 / 20 Thais Paiva STA 111 - Summer 2013 Term II Lecture 21, 08/05/2013 Lecture Plan 1 Goodness of
More informationNegative Multinomial Model and Cancer. Incidence
Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationLinear Algebra Solutions 1
Math Camp 1 Do the following: Linear Algebra Solutions 1 1. Let A = and B = 3 8 5 A B = 3 5 9 A + B = 9 11 14 4 AB = 69 3 16 BA = 1 4 ( 1 3. Let v = and u = 5 uv = 13 u v = 13 v u = 13 Math Camp 1 ( 7
More informationMULTILEVEL IMPUTATION 1
MULTILEVEL IMPUTATION 1 Supplement B: MCMC Sampling Steps and Distributions for Two-Level Imputation This document gives technical details of the full conditional distributions used to draw regression
More informationPrincipal Component Analysis vs. Independent Component Analysis for Damage Detection
6th European Workshop on Structural Health Monitoring - Fr..D.4 Principal Component Analysis vs. Independent Component Analysis for Damage Detection D. A. TIBADUIZA, L. E. MUJICA, M. ANAYA, J. RODELLAR
More informationLecture Notes 2: Variables and graphics
Highlights: Lecture Notes 2: Variables and graphics Quantitative vs. qualitative variables Continuous vs. discrete and ordinal vs. nominal variables Frequency distributions Pie charts Bar charts Histograms
More informationSingular value decomposition (SVD) of large random matrices. India, 2010
Singular value decomposition (SVD) of large random matrices Marianna Bolla Budapest University of Technology and Economics marib@math.bme.hu India, 2010 Motivation New challenge of multivariate statistics:
More informationMAT 1332: CALCULUS FOR LIFE SCIENCES. Contents. 1. Review: Linear Algebra II Vectors and matrices Definition. 1.2.
MAT 1332: CALCULUS FOR LIFE SCIENCES JING LI Contents 1 Review: Linear Algebra II Vectors and matrices 1 11 Definition 1 12 Operations 1 2 Linear Algebra III Inverses and Determinants 1 21 Inverse Matrices
More informationLecture 8: Summary Measures
Lecture 8: Summary Measures Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 8:
More informationFREQUENCY DISTRIBUTIONS AND PERCENTILES
FREQUENCY DISTRIBUTIONS AND PERCENTILES New Statistical Notation Frequency (f): the number of times a score occurs N: sample size Simple Frequency Distributions Raw Scores The scores that we have directly
More informationEigenvalues and Eigenvectors
Eigenvalues and Eigenvectors Definition 0 Let A R n n be an n n real matrix A number λ R is a real eigenvalue of A if there exists a nonzero vector v R n such that A v = λ v The vector v is called an eigenvector
More information