Correlation Coefficients for Binary Data in Factor Analysis

Jerome Kaltenhauser and Yuk Lee

Jerome Kaltenhauser is systems analyst, Computing Center, and Yuk Lee is associate professor of geography, University of Colorado.

The most commonly used factor analytic models employ a symmetric matrix of correlation coefficients as input. The Pearson product moment coefficient is appropriate when the variables are continuous. For binary (two-valued) data there is no single agreed-upon coefficient [7]. Three coefficients (phi, phi/phimax, and tetrachoric) are frequently discussed in the literature and are the focus of the present investigation. It is assumed that the coefficient is to be used in factor analysis and specifically in exploratory factor analyses [3].

The characteristics of these coefficients have been discussed by others [2, 4, 8] and will be quickly summarized here. It is convenient to discuss the coefficients in terms of a two-way table relating the occurrences of the values of the binary variables x and y to be correlated.

    X\Y      0        1
    0        c        d        c+d
    1        b        a        b+a
             c+b      d+a      1.0

a, b, c, and d are the joint frequencies of combinations of values of x and y, while c + d, a + b, b + c, and a + d are the marginal frequencies or proportions for y and x.

Despite the special name and numerous expressions available for it, phi is simply the Pearson product moment formula applied to binary data. That is,

$$\phi = r = \mathbf{x}^{T}\mathbf{y}, \tag{1}$$

where $\mathbf{x}$ is the vector of standardized values of variable x (and $\mathbf{x}^{T}$ is its transpose) and similarly for y; r is the product moment correlation coefficient. A defect of phi is that it can achieve the range -1.0 to 1.0 only under rare circumstances, when

$$\frac{a+b}{c+d} = \frac{a+d}{b+c}. \tag{2}$$
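The identity between phi and the product moment formula, and the effect of unequal marginals, can be checked numerically. The following is a minimal Python sketch, not part of the original study. It assumes the cell layout read from the two-way table (a is the x = 1, y = 1 cell, b is x = 1, y = 0, c is x = 0, y = 0, and d is x = 0, y = 1), and the function names are illustrative only.

```python
import math


def phi_from_table(a, b, c, d):
    """Phi from the joint frequencies of the two-way table.

    Assumed layout: a = (x=1, y=1), b = (x=1, y=0),
    c = (x=0, y=0), d = (x=0, y=1).
    """
    numerator = a * c - b * d
    denominator = math.sqrt((a + b) * (c + d) * (a + d) * (b + c))
    return numerator / denominator


def pearson(xs, ys):
    """Ordinary product moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def expand_table(a, b, c, d):
    """Turn integer cell counts back into two 0/1 vectors."""
    xs = [1] * a + [1] * b + [0] * c + [0] * d
    ys = [1] * a + [0] * b + [0] * c + [1] * d
    return xs, ys


if __name__ == "__main__":
    xs, ys = expand_table(3, 1, 4, 2)
    print(phi_from_table(3, 1, 4, 2), pearson(xs, ys))
    # Unequal marginals: every x = 1 case also has y = 1, yet phi stays below 1.
    print(phi_from_table(2, 0, 5, 3))
```

In the second table the marginals are 0.2 and 0.5, and although no case has x = 1 with y = 0, phi comes out at 10/20 = 0.5 rather than 1.0.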
306 / Geographical Analysis

Thus, phi will usually suffer some restriction in range. The simplest way to bring phi up to full range is to normalize it, and this produces phi/phimax. Phi/phimax is obtained by dividing phi by the maximum value it could assume consistent with the set of marginals from its two-way table:

$$\phi_{\max} = \sqrt{\frac{p_j}{q_j}\cdot\frac{q_i}{p_i}}, \tag{3}$$

where $p_i$ is the largest marginal in the table, $p_j$ is the second largest, and p + q = 1.0 [2].

The tetrachoric coefficient was derived on the assumption that the observed frequencies in the two-way table express an underlying bivariate normal distribution. If a bivariate normal distribution is divided into quadrants by partitions parallel to the x- and y-axes, then the tetrachoric is a maximum likelihood estimate, using only the quadrant frequencies, of the product moment coefficient which would be calculated if the full bivariate distribution were available [4, 6]. Because calculation of the tetrachoric r involves evaluation of an infinite series, approximations must be resorted to.

Of the three coefficients, only phi suffers from restriction of range. This is important when correlations are interpreted directly as a measure of relationship between two variables. It is not necessarily of importance in factor analysis, where correlations are not directly interpreted. Phi has been used successfully in factor analysis, and an example of its performance relative to phi/phimax and the tetrachoric will be presented. Later we will see that phi has characteristics that would make it seem even less likely as a candidate for any kind of analysis.

Coefficient Performance in Factor Analysis

A preliminary evaluation of the three coefficients was made in a principal components analysis involving social, economic, and demographic data for Colorado counties. A matrix of 62 county observations on 18 variables was prepared and submitted to a principal components analysis using the SPSS package [5].
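Before binary versions of such a correlation matrix can be assembled, each pair of variables needs its phimax and tetrachoric values. The sketch below is one possible way to compute them in Python and is offered under stated assumptions, not as the procedure used in the study: `phi_max` follows equation (3) with inputs given as the proportions of 1's, and `tetrachoric` is a plain numerical root search (Simpson integration of the bivariate normal density over the correlation, then bisection), which stands in for, and is not, the approximation of Kirk [4]. All function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def phi_max(p_x, p_y):
    """Equation (3). p_x and p_y are the proportions of 1's (strictly between 0 and 1).

    Caveat: when the larger marginals of x and y sit on opposite categories,
    this value is the size of the most negative attainable phi instead.
    """
    larger_x = max(p_x, 1.0 - p_x)
    larger_y = max(p_y, 1.0 - p_y)
    p_i = max(larger_x, larger_y)  # largest marginal in the table
    p_j = min(larger_x, larger_y)  # second largest
    return math.sqrt((p_j / (1.0 - p_j)) * ((1.0 - p_i) / p_i))


def bvn_density(h, k, t):
    """Standard bivariate normal density at (h, k) with correlation t."""
    s = 1.0 - t * t
    return math.exp(-(h * h - 2.0 * t * h * k + k * k) / (2.0 * s)) / (
        2.0 * math.pi * math.sqrt(s)
    )


def upper_quadrant_prob(h, k, r, steps=400):
    """P(X > h, Y > k) for correlation r.

    Uses the fact that the derivative of this probability with respect to
    the correlation equals the bivariate density, integrated by Simpson's rule.
    """
    base = (1.0 - STD_NORMAL.cdf(h)) * (1.0 - STD_NORMAL.cdf(k))
    width = r / steps
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, r)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * bvn_density(h, k, i * width)
    return base + total * width / 3.0


def tetrachoric(p_x, p_y, p_11, tol=1e-6):
    """Correlation whose upper quadrant matches p_11 = P(x = 1, y = 1)."""
    h = STD_NORMAL.inv_cdf(1.0 - p_x)  # cutoff implied by the x marginal
    k = STD_NORMAL.inv_cdf(1.0 - p_y)  # cutoff implied by the y marginal
    low, high = -0.999, 0.999
    while high - low > tol:
        mid = (low + high) / 2.0
        if upper_quadrant_prob(h, k, mid) < p_11:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    print(phi_max(0.5, 0.2))
    print(tetrachoric(0.5, 0.5, 1.0 / 3.0))
```

With both cutoffs at the mean the quadrant probability has the closed form 1/4 + arcsin(r)/(2π), which gives a convenient check on the root search.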
The variables, which are continuous, were transformed to near normality to approximate the requirements of the tetrachoric coefficient (underlying bivariate normal distribution). For all but three of the variables the transformations were successful (50 percent confidence level using a chi-square test), and the exceptions do not appear to have seriously disturbed the results. With the variables still in continuous form, the above procedure produced a principal components analysis of a system of variables; the rotated factor loadings were used subsequently as a criterion against which to measure the loadings produced when the variables were converted to binary form.

To convert variables to binary form, a cutoff point b must be chosen for each, with those values below b set to 0 and those above set to 1. Any two such binarized variables may then be related through a two-way table and phi, phi/phimax, and tetrachoric correlation coefficients computed. The cutoff points may differ from variable to variable and may range from $-\infty$ to $\infty$. When they leave the range $\mu \pm \sigma$, however, the coefficients will give poor results for moderate sample sizes. In the present example b was restricted to $\pm 0.84\sigma$ which, for normal distributions, will allow up to an 80 percent-20 percent split of 0's and 1's.

A computer program was written to calculate phi, phi/phimax, and tetrachoric given the desired cutoff points. The tetrachoric coefficient was calculated using the method described by Kirk [4]. A matrix of each type of coefficient was output for each of a large number of different b values, and the matrices were then submitted to the principal components analysis. A set of rotated factor loadings was obtained in each case, to allow comparison with the corresponding loadings from the continuous variables case. The comparison of the loadings was carried out by computing the root-mean-square (RMS) difference between the continuous and binary situations:

$$\mathrm{RMS} = \sqrt{\frac{1}{MF}\sum_{f=1}^{F}\sum_{i=1}^{M}\left(C_i - B_i\right)^{2}}, \tag{4}$$

where M is the number of loadings compared on a given factor, F is the number of factors, $C_i$ is a continuous loading, and $B_i$ is the corresponding loading from the binary case. If $B_i = C_i$ for all loadings, RMS would be zero, indicating a perfect match. A large RMS indicates a poor match, but the scale is arbitrary. In the present example, the RMS values for phi/phimax and tetrachoric will be compared with those from phi.

Figure 1 shows the results for a large number of runs. The runs are distinguished from each other by the choice of cutoff points used to binarize the data. There was only one continuous case run since the choice of points for binarization does not affect the continuous case. To exclude insignificant variability, only loadings that exceeded a minimum value were used in the comparisons.

FIG. 1. Performance of Coefficients of Binarized Data. (Plotted series: Tetrachoric vs. Phi and Phi/Phimax vs. Phi. Axis label: Tetrachoric and Phi/Phimax RMS Deviation.)

It will be seen in Figure 1 that the RMS error for phi is generally smaller than for either phi/phimax or the tetrachoric, despite the fact that cutoff points far from the mean, which should restrict the range of phi considerably, were chosen in many cases. Thus phi appears in this example to give good results in factor analysis.

Of course one empirical example establishes nothing. Results may vary with sample size (here N = 62) and with variable distributions. Normal marginal distributions, as effected here by transformations, do not guarantee the bivariate normality required by the tetrachoric [1]. To explore the effects of some of these parameters a simulation study was undertaken.

Simulation Study

A simulation in the present context offers the advantage that the population parameters can be controlled and varied at will, but suffers from the defect that a random element is introduced. Each simulation presents us with only a particular outcome and so can never establish a general case. This can be mitigated, however, by examining a large number of outcomes for regularities.

In this study, sets of N values of variables x and y were generated, with $r_{xy}$ being the population correlation coefficient and N the sample size. The set of N x,y pairs may then be regarded as a random sample of N points from an infinite population with a correlation coefficient $r_{xy}$. The method of generating x and y simulated drawings from a bivariate normal distribution.
In particular,

$$x = n(z_i;\, 0.0,\, 1.0), \tag{5}$$

$$y = x\,r_{xy} + \left(1.0 - r_{xy}^{2}\right)^{1/2} n(z_j;\, 0.0,\, 1.0), \tag{6}$$

where $z_i$ and $z_j$ are random numbers and $n(z; \mu, \sigma)$ is a function that converts a number to a point on the normal cumulative distribution curve. x, which is generated first and then substituted into the equation for y, is a random normal variate with (population) zero mean and (population) unit variance. y is similarly distributed, and the values of x,y pairs are such as to simulate two variables correlated $r_{xy}$.

Once N such x,y pairs are available, the sample correlation coefficient can be calculated. x and y can be binarized and phi, phi/phimax, and tetrachoric coefficients calculated. This process can be done repeatedly to get a sample of correlation coefficients (in the previous instance x,y pairs only were sampled whereas here correlation coefficients are sampled). N can be varied to examine the effect on the binary correlation coefficients. In addition, to see the effect of nonnormal distributions, the x,y pairs generated according to equations (5) and (6) were perturbed in two ways: first, varying amounts of uniform random noise were added in, and second, x and/or y were transformed according to

$$z = (x + c)^{h}, \tag{7}$$

where h was allowed to vary up to 2.75 and c is a constant selected such that x + c > 0. In each instance, x and y were restandardized prior to calculating the correlation coefficients.

Figure 2 presents the averages of sample correlation coefficients generated as above. Figure 2 contains nine panels, each containing plots of average binary coefficients as a function of sample size N. Columns of panels are differentiated by population $r_{xy}$. Rows of panels are differentiated by the binarization points employed. The runs in the top row were binarized at the mean, b(x) = 0.0 and b(y) = 0.0, and in the lower panels b(y) departs progressively from the mean. b(y) = 0.84 implies that the binarization point for y was at $\mu + 0.84\sigma$, or, with $\mu = 0.00$ and $\sigma = 1.0$, at 0.84. All values of y greater than 0.84 were converted to 1 and all others to 0. All phi values in the lower right-hand panel are less than $r_{xy}$.

FIG. 2. Artificial Data: Calculated Coefficients with Varying Sample Parameters
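The generating scheme of equations (5) and (6), followed by binarization and averaging, can be sketched in a few lines of Python. This is an illustrative reconstruction under assumptions, not the authors' program: the inverse normal CDF from the standard library plays the role of the function n, only phi is computed, the perturbations of equation (7) are left out, and names such as `simulate_phi` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist(0.0, 1.0)


def n(z, mu=0.0, sigma=1.0):
    """Convert a random number z in (0, 1) to a normal deviate (inverse CDF)."""
    return mu + sigma * STD_NORMAL.inv_cdf(z)


def uniform_open(rng):
    """Uniform random number in the open interval (0, 1)."""
    z = rng.random()
    while z == 0.0:  # inv_cdf is undefined at 0, so redraw
        z = rng.random()
    return z


def draw_pair(r_xy, rng):
    """One x,y pair following equations (5) and (6)."""
    x = n(uniform_open(rng))
    y = x * r_xy + math.sqrt(1.0 - r_xy ** 2) * n(uniform_open(rng))
    return x, y


def phi(xs, ys):
    """Product moment formula applied to 0/1 data.

    Raises ZeroDivisionError if either binary variable is constant.
    """
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_phi(r_xy, n_pairs, reps, b_x, b_y, rng):
    """Mean and standard deviation of phi over `reps` samples of `n_pairs` pairs."""
    coefficients = []
    for _ in range(reps):
        pairs = [draw_pair(r_xy, rng) for _ in range(n_pairs)]
        xs = [1 if x > b_x else 0 for x, _ in pairs]  # binarize at cutoff b(x)
        ys = [1 if y > b_y else 0 for _, y in pairs]  # binarize at cutoff b(y)
        coefficients.append(phi(xs, ys))
    return mean(coefficients), stdev(coefficients)


if __name__ == "__main__":
    rng = random.Random(0)
    # Sixteen coefficients of fifty pairs each: eight hundred pairs in total.
    print(simulate_phi(0.5, 50, 16, 0.0, 0.0, rng))
    print(simulate_phi(0.5, 50, 16, 0.0, 0.84, rng))
```

The two calls mimic one data point each from the top and a lower row of panels; the run sizes are taken from the N = 50 case, while the seed is arbitrary.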
Each data point in Figure 2 represents the average of a number of sample binary correlation coefficients. N = 50 implies that fifty x,y pairs were used for each coefficient; sixteen such coefficients were calculated and averaged, for a total of eight hundred x,y pairs. Eight hundred were used for every data point, and thus N = 100 denotes eight coefficients of one hundred x,y pairs apiece, and so forth. Phi, phi/phimax, and tetrachoric show similar trends with N because a single binarized set of data for each N was used to calculate the three coefficients. The data sets for different values of N are independent.

The most striking regularity in Figure 2 is that phi systematically underestimates $r_{xy}$. The defect grows with increasing $r_{xy}$ and increasing departure of b(y) from the mean. It is apparent that phi is not a close estimate of the continuous product moment coefficient. Phi/phimax and tetrachoric, on the other hand, appear to supply reasonable estimates of this parameter, even for extreme binarization points. (Phi/phimax is absent from the top diagrams of Figs. 2-4 because phi and phi/phimax become the same in those situations.)

How then can phi perform so well in factor analysis? A clue can be found in equation (8) for factor loadings,

$$R_{m\times m} = F_{m\times p}\,F^{T}_{p\times m}, \tag{8}$$

where $R_{m\times m}$ is the m×m correlation matrix and F is the matrix of factor loadings. If a matrix $K_{m\times m}$ with the constant k everywhere on its diagonal is introduced, then

$$K_{m\times m}R_{m\times m} = K_{m\times m}F_{m\times p}F^{T}_{p\times m},$$

$$K_{m\times m}R_{m\times m} = K^{1/2}_{m\times m}F_{m\times p}F^{T}_{p\times m}K^{1/2}_{m\times m},$$

or

$$K_{m\times m}R_{m\times m} = \left(K^{1/2}_{m\times m}F_{m\times p}\right)\left(F^{T}_{p\times m}K^{1/2}_{m\times m}\right). \tag{9}$$

In other words, if all values in the correlation matrix are multiplied by k, the corresponding loadings will be changed by $k^{1/2}$ but the relative values of the factors will be unchanged. Interpretation of the factors should not be altered, especially after rotation.

Figure 3 compares the coefficients with the effect of a hypothetical constant k removed. In each panel, the average coefficient for $r_{xy}$ = 0.20 and $r_{xy}$ = 0.50 is divided by the average for the highest $r_{xy}$. If k were not a constant (that is, if k varied with the magnitude of $r_{xy}$), a trend should be visible with varying $r_{xy}$. This does not appear to be so, and it can be concluded that k is very nearly a constant. There is a slight trend with b(y), comparing with the tetrachoric, but it would appear that in a real data matrix, where the binarization points would be mixed for different variables, it should be of little consequence. This appears to be the explanation of the good performance of phi (relative to phi/phimax and the tetrachoric) in factor analysis, despite its obvious deficiencies.

FIG. 3. Artificial Data: Values of Fig. 2 Ratioed by the Highest Coefficients. (Legend: Phi, Phi/Phimax, Tetrachoric; panels labeled by b(x) and b(y).)

The evidence from Figures 2 and 3 would seem to place phi, phi/phimax, and the tetrachoric on an equal basis for use in factor analysis, whereas there is some evidence, presented above, that phi performs somewhat better. Figure 4 shows why this is so. In Figure 4 the sample standard deviation of each coefficient is presented as a function of $r_{xy}$, N, and b(y), as in Figure 2. It is evident that the sample variance of phi is much smaller than that of phi/phimax and the tetrachoric: the latter two coefficients have standard deviations 50 percent to 100 percent larger than phi, at least for the medium-sized coefficients, which are likely to be numerous in a real correlation matrix.

FIG. 4. Artificial Data: Sampling Standard Deviation of Calculated Coefficients

Conclusions

The simulation runs in Figures 2 through 4 are only a small fraction of the total examined. The others used different binarization points and (slightly) nonnormal population distributions. Those presented here are representative of the total, however. The sample averages and standard deviations varied as the distributions departed from normal, but the relations among the coefficients did not. It therefore appears that phi is adequate and perhaps even superior for use with binary data in factor analysis. With increasing sample size, the advantages of phi may disappear but not so as to preclude its use. Phi has two other advantages: it does not require a bivariate normal distribution as does the tetrachoric, and it is as easy to calculate in the usual factor analysis program as is the continuous product moment coefficient (the mechanics are the same).

This study has not addressed the situation where the data matrix contains a mixture of continuous and binary data.
It seems that the use of the product moment formula, which produces Pearson correlation coefficients for continuous data and phi coefficients at the other extreme, should turn out satisfactory coefficients in that case also, where the severe restrictions of two-valued variables have been relaxed.

LITERATURE CITED

1. CARROLL, J. B. The Nature of the Data, or How to Choose a Correlation Coefficient. Psychometrika, 26 (1961).
2. GUILFORD, J. P. Fundamental Statistics in Psychology and Education. 3rd ed. New York: McGraw-Hill Book Company.
3. KAISER, HENRY F. A Second-Generation Little Jiffy. Psychometrika, 35 (December 1970).
4. KIRK, DAVID B. On the Numerical Approximation of the Bivariate Normal (Tetrachoric) Correlation Coefficient. Psychometrika, 38 (June 1973).
5. NIE, NORMAN H., DALE H. BENT, and C. HADLAI HULL. SPSS: Statistical Package for the Social Sciences. New York: McGraw-Hill Book Company.
6. PEARSON, K. I. Mathematical Contribution to the Theory of Evolution, VII, On the Correlation of Characters Not Quantitatively Measurable. Phil. Trans. Roy. Soc. London, 1901, 195A.
7. RUMMEL, R. J. Applied Factor Analysis. Evanston: Northwestern University Press.
8. WALKER, HELEN M., and JOSEPH LEV. Statistical Inference. New York: Holt, Rinehart and Winston, 1953.
More informationExploratory Factor Analysis and Principal Component Analysis
Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and
More informationAppendix A: Matrices
Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows
More informationCHAPTER 3. THE IMPERFECT CUMULATIVE SCALE
CHAPTER 3. THE IMPERFECT CUMULATIVE SCALE 3.1 Model Violations If a set of items does not form a perfect Guttman scale but contains a few wrong responses, we do not necessarily need to discard it. A wrong
More informationPrinciple Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA
Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis: Uses one group of variables (we will call this X) In
More informationInverse Sampling for McNemar s Test
International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test
More informationConcept of Reliability
Concept of Reliability 1 The concept of reliability is of the consistency or precision of a measure Weight example Reliability varies along a continuum, measures are reliable to a greater or lesser extent
More informationIncompatibility Paradoxes
Chapter 22 Incompatibility Paradoxes 22.1 Simultaneous Values There is never any difficulty in supposing that a classical mechanical system possesses, at a particular instant of time, precise values of
More informationPatterns in Offender Distance Decay and the Geographic Profiling Problem.
Patterns in Offender Distance Decay and the Geographic Profiling Problem. Mike O Leary Towson University 2010 Fall Western Section Meeting Los Angeles, CA October 9-10, 2010 Mike O Leary (Towson University)
More informationFACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING
FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT
More informationCalculation and Application of MOPITT Averaging Kernels
Calculation and Application of MOPITT Averaging Kernels Merritt N. Deeter Atmospheric Chemistry Division National Center for Atmospheric Research Boulder, Colorado 80307 July, 2002 I. Introduction Retrieval
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More informationSome Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression
Working Paper 2013:9 Department of Statistics Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Ronnie Pingel Working Paper 2013:9 June
More informationDimensionality Reduction Techniques (DRT)
Dimensionality Reduction Techniques (DRT) Introduction: Sometimes we have lot of variables in the data for analysis which create multidimensional matrix. To simplify calculation and to get appropriate,
More informationLinear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations.
POLI 7 - Mathematical and Statistical Foundations Prof S Saiegh Fall Lecture Notes - Class 4 October 4, Linear Algebra The analysis of many models in the social sciences reduces to the study of systems
More informationEstimation of Parameters
CHAPTER Probability, Statistics, and Reliability for Engineers and Scientists FUNDAMENTALS OF STATISTICAL ANALYSIS Second Edition A. J. Clark School of Engineering Department of Civil and Environmental
More information2. Matrix Algebra and Random Vectors
2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns
More informationCHAPTER 5 ANALYSIS OF STRUCTURES. Expected Outcome:
CHAPTER ANALYSIS O STRUCTURES Expected Outcome: Able to analyze the equilibrium of structures made of several connected parts, using the concept of the equilibrium of a particle or of a rigid body, in
More informationECNS 561 Multiple Regression Analysis
ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationJUST THE MATHS UNIT NUMBER 1.5. ALGEBRA 5 (Manipulation of algebraic expressions) A.J.Hobson
JUST THE MATHS UNIT NUMBER 1.5 ALGEBRA 5 (Manipulation of algebraic expressions) by A.J.Hobson 1.5.1 Simplification of expressions 1.5.2 Factorisation 1.5.3 Completing the square in a quadratic expression
More informationLecture 6 Positive Definite Matrices
Linear Algebra Lecture 6 Positive Definite Matrices Prof. Chun-Hung Liu Dept. of Electrical and Computer Engineering National Chiao Tung University Spring 2017 2017/6/8 Lecture 6: Positive Definite Matrices
More informationACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin
ACE 562 Fall 2005 Lecture 2: Probability, Random Variables and Distributions Required Readings: by Professor Scott H. Irwin Griffiths, Hill and Judge. Some Basic Ideas: Statistical Concepts for Economists,
More informationChapter. Algebra techniques. Syllabus Content A Basic Mathematics 10% Basic algebraic techniques and the solution of equations.
Chapter 2 Algebra techniques Syllabus Content A Basic Mathematics 10% Basic algebraic techniques and the solution of equations. Page 1 2.1 What is algebra? In order to extend the usefulness of mathematical
More informationDiscrete Distributions
Discrete Distributions STA 281 Fall 2011 1 Introduction Previously we defined a random variable to be an experiment with numerical outcomes. Often different random variables are related in that they have
More informationALGEBRA. 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers
ALGEBRA CHRISTIAN REMLING 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers by Z = {..., 2, 1, 0, 1,...}. Given a, b Z, we write a b if b = ac for some
More informationInvestigation into the use of confidence indicators with calibration
WORKSHOP ON FRONTIERS IN BENCHMARKING TECHNIQUES AND THEIR APPLICATION TO OFFICIAL STATISTICS 7 8 APRIL 2005 Investigation into the use of confidence indicators with calibration Gerard Keogh and Dave Jennings
More informationAlgebraic Expressions and Identities
ALGEBRAIC EXPRESSIONS AND IDENTITIES 137 Algebraic Expressions and Identities CHAPTER 9 9.1 What are Expressions? In earlier classes, we have already become familiar with what algebraic expressions (or
More informationNesting and Equivalence Testing
Nesting and Equivalence Testing Tihomir Asparouhov and Bengt Muthén August 13, 2018 Abstract In this note, we discuss the nesting and equivalence testing (NET) methodology developed in Bentler and Satorra
More informationNAG Library Chapter Introduction. G08 Nonparametric Statistics
NAG Library Chapter Introduction G08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric Hypothesis Testing... 2 2.2 Types
More informationTOTAL JITTER MEASUREMENT THROUGH THE EXTRAPOLATION OF JITTER HISTOGRAMS
T E C H N I C A L B R I E F TOTAL JITTER MEASUREMENT THROUGH THE EXTRAPOLATION OF JITTER HISTOGRAMS Dr. Martin Miller, Author Chief Scientist, LeCroy Corporation January 27, 2005 The determination of total
More informationLinear Algebra. Chapter Linear Equations
Chapter 3 Linear Algebra Dixit algorizmi. Or, So said al-khwarizmi, being the opening words of a 12 th century Latin translation of a work on arithmetic by al-khwarizmi (ca. 78 84). 3.1 Linear Equations
More informationFACTOR ANALYSIS AS MATRIX DECOMPOSITION 1. INTRODUCTION
FACTOR ANALYSIS AS MATRIX DECOMPOSITION JAN DE LEEUW ABSTRACT. Meet the abstract. This is the abstract. 1. INTRODUCTION Suppose we have n measurements on each of taking m variables. Collect these measurements
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More informationØ Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.
Statistical Tools in Evaluation HPS 41 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific number
More informationMatrix Algebra. Matrix Algebra. Chapter 8 - S&B
Chapter 8 - S&B Algebraic operations Matrix: The size of a matrix is indicated by the number of its rows and the number of its columns. A matrix with k rows and n columns is called a k n matrix. The number
More informationPower Comparison of Exact Unconditional Tests for Comparing Two Binomial Proportions
Power Comparison of Exact Unconditional Tests for Comparing Two Binomial Proportions Roger L. Berger Department of Statistics North Carolina State University Raleigh, NC 27695-8203 June 29, 1994 Institute
More informationHo Chi Minh City University of Technology Faculty of Civil Engineering Department of Water Resources Engineering & Management
Lecturer: Associ. Prof. Dr. NGUYỄN Thống E-mail: nguyenthong@hcmut.edu.vn or nthong56@yahoo.fr Web: http://www4.hcmut.edu.vn/~nguyenthong/index 4/5/2016 1 Tél. (08) 38 691 592-098 99 66 719 CONTENTS Chapter
More information