Sparse orthogonal factor analysis


Kohei Adachi and Nickolay T. Trendafilov

Abstract A sparse orthogonal factor analysis procedure is proposed for estimating the optimal solution with sparse loadings. In the procedure, an alternating least squares algorithm estimates the parameters for a specified sparseness of the loadings, and the suitable sparseness is selected by an information criterion. It is worth noting that the proposed procedure constrains the sparseness directly, without using a penalty function.

Key words: factor analysis, sparse loading matrix, direct sparseness constraint

1 Introduction

Factor analysis (FA) is classified as exploratory (EFA) or confirmatory (CFA). In EFA, the factor loading matrix is unconstrained and has rotational freedom, which is exploited to rotate the matrix so that it approximates a matrix with zero elements. In CFA, some loadings are constrained to be zero and the loading matrix has no rotational freedom (Mulaik, 2010). A loading matrix containing zero elements is said to be sparse, a property indispensable for loadings to be interpretable. In EFA, a loading matrix is rotated toward a sparse matrix, but literal sparseness is not attained, since rotated loadings cannot be exactly equal to zero. On the other hand, some loadings are fixed exactly at zero in CFA; the problem there, however, is that the number of zero loadings and their locations must be chosen by the user in a subjective manner. To overcome these difficulties, we propose a new FA procedure, which is neither EFA nor CFA, for estimating the optimal orthogonal factor solution with a sparse loading matrix that has a suitable number of zero elements, whose locations are also estimated computationally. The proposed procedure consists of the following two stages:

Kohei Adachi, Graduate School of Human Sciences, Osaka University, Japan; adachi@hus.osaka-u.ac.jp
Nickolay T. Trendafilov, Department of Mathematics and Statistics, Open University, UK; Nickolay.Trendafilov@open.ac.uk

[A] The optimal solution is obtained for a specified number of zero loadings.
[B] The optimal number of zero loadings is selected among the possible numbers.

Stages [A] and [B] are described in Sections 2-3 and Section 4, respectively. In the area of principal component analysis (PCA), many procedures, called sparse PCA, have been proposed in the last decade (e.g., Jolliffe, Trendafilov & Uddin, 2003; Zou, Hastie & Tibshirani, 2006). As in our FA procedure, they obtain sparse loadings. However, besides the difference between PCA and FA, our approach does not rely on penalty functions, which are the standard way of inducing sparseness in the existing sparse PCA.

2 Sparse Factor Problem

The main goal of FA is to estimate the p-variables × m-factors matrix Λ containing the loadings and the p × p diagonal matrix Ψ² containing the unique variances from the n-observations × p-variables (n > p) column-centred data matrix X. For this goal, FA can be formulated with several different loss functions, among which we choose

f(F, U, Λ, Ψ) = ‖X − (FΛ′ + UΨ)‖² = ‖X − ZB′‖²,  (1)

recently presented by de Leeuw (2004), Unkel and Trendafilov (2010), and Trendafilov and Unkel (2011). Here, B = [Λ, Ψ] is a p × (m + p) block matrix and Z = [F, U] is the n × (m + p) one containing the common and unique factor matrices F (n × m) and U (n × p), respectively. The factor score matrix Z is constrained to satisfy

n⁻¹Z′Z = I_{m+p},  (2)

with I_{m+p} the identity matrix of order m + p. We propose to minimize (1) over F, U, Λ, and Ψ subject to (2) and

SP(Λ) = q,  (3)

where SP(Λ) expresses the sparseness of Λ, i.e., the number of its elements that are zero, and q is a specified integer. The reason for choosing loss function (1) is that it can be rewritten as

f(F, U, Λ, Ψ) = ‖X − (FA′ + UΨ)‖² + n‖Λ − A‖², with A = n⁻¹X′F,  (1′)

and can thus easily be minimized over Λ subject to (3), as shown in the next section.
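To make the equivalence of (1) and (1′) concrete, the following minimal numpy check (our illustration, not the authors' code) draws random F, U, Λ, and Ψ satisfying constraint (2) and confirms that the two forms of the loss coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 100, 6, 2

# Z = [F, U] with n^{-1} Z'Z = I_{m+p}, built from the QR of a random matrix
Z = np.sqrt(n) * np.linalg.qr(rng.standard_normal((n, m + p)))[0]
F, U = Z[:, :m], Z[:, m:]

X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                      # column-centred data matrix

Lam = rng.standard_normal((p, m))        # loadings Lambda
Psi = np.diag(rng.uniform(0.2, 0.8, p))  # diagonal Psi

A = X.T @ F / n                          # A = n^{-1} X'F
f1 = np.linalg.norm(X - (F @ Lam.T + U @ Psi))**2    # loss (1)
f2 = np.linalg.norm(X - (F @ A.T + U @ Psi))**2 \
     + n * np.linalg.norm(Lam - A)**2                # rewriting (1')
print(np.isclose(f1, f2))                # True: (1) and (1') agree
```

The second term of (1′) shows why the constrained update of Λ is easy: for fixed Z and Ψ, it suffices to copy A and zero its q smallest elements in absolute value, which is exactly update (4) below.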

3 Algorithm

For minimizing (1) subject to (2) and (3), we alternately iterate the update of each parameter matrix. First, let us consider updating Λ so that (1), or equivalently (1′), is minimized subject to (3) while Z = [F, U] and Ψ are kept fixed. The optimal update of Λ = (λ_jk) is given by

λ_jk = 0 if |a_jk| ≤ a^[q], and λ_jk = a_jk otherwise,  (4)

where a_jk is the (j, k) element of

A = (a_jk) = n⁻¹X′F  (5)

and a^[q] is the q-th smallest absolute value among those of the elements of A.

Next, let us consider updating the diagonal matrix Ψ. We can find that (1) is minimized for

Ψ = diag(n⁻¹X′U)  (6)

when Z = [F, U] and Λ are fixed.

Finally, let us consider updating Z = [F, U] so that (1) is minimized subject to (2) with Λ and Ψ kept fixed. Since (1) can be rewritten as tr X′X + n tr BB′ − 2 tr(XB)′Z using (2), its minimum is found to be attained for

Z = n^{1/2}PQ′ = n^{1/2}P₁Q₁′ + n^{1/2}P₂Q₂′,  (7)

with P = [P₁, P₂] and Q = [Q₁, Q₂] obtained through the singular value decomposition (SVD) of the n × (m + p) matrix n^{−1/2}XB:

n^{−1/2}XB = PΔQ′ = P₁Δ₁Q₁′,  (8)

where Δ is the diagonal matrix of order m + p whose leading p × p diagonal block is the positive definite matrix Δ₁ and whose remaining diagonal block is O_m, the m × m matrix of zeros. Here, rank(XB) = p is assumed, and P and Q satisfy P′P = Q′Q = QQ′ = I_{m+p}, with P₁ and Q₁ being n × p and (m + p) × p matrices, respectively.

Although (7) and (8) show that Z cannot be uniquely determined, the p-variables × (m + p)-factors covariance matrix n⁻¹X′Z = [n⁻¹X′F, n⁻¹X′U] = [A, n⁻¹X′U] used for updates (4) and (6) is given uniquely by

n⁻¹X′Z = (n^{−1/2}X)′(n^{−1/2}Z) = (B⁺′Q₁Δ₁P₁′)(PQ′) = B⁺′Q₁Δ₁Q₁′.  (9)

This equality follows from the fact that the Moore-Penrose inverse of B is given by B⁺ = B′(BB′)⁻¹, since rank(XB) = p implies that B is of full row rank: the use of BB⁺ = I_p in (8) leads to n^{−1/2}X = P₁Δ₁Q₁′B⁺, which is transposed and post-multiplied by (7) to give (9) (Adachi, 2012). Comparing (9) with (5) and (6), we find that they can be rewritten as

A = B⁺′Q₁Δ₁Q₁′H_m,  (5′)
Ψ = diag(B⁺′Q₁Δ₁Q₁′H_p),  (6′)

using H_m = [I_m, O_{m×p}]′ and H_p = [O_{p×m}, I_p]′, with O_{m×p} the m × p matrix of zeros. Here, we should distinguish between Ψ on the left-hand side of (6′) and its counterpart in B = [Λ, Ψ] on the right-hand side: the former is the updated one, while the latter is the one from the previous iteration.

The above equations show that Λ and Ψ can be updated without obtaining Z, provided only that the sample covariance matrix S = n⁻¹X′X is available, even when the original data matrix X is not given. That is, (8) shows that the eigenvalue decomposition (EVD) B′SB = QΔ²Q′ gives the matrices Q₁ and Δ₁ needed in (5′) and (6′), with (5′) being used for (4). Further, the resulting loss function value can be computed without the use of X: substituting (2), (5) and (6) into an expanded form of loss function (1), we can rewrite it as f(Λ, Ψ) = n tr S + n tr ΛΛ′ − 2n tr A′Λ − n tr Ψ². This can be simplified further into f(B) = n{tr S − tr(ΛΛ′ + Ψ²)} = n(tr S − tr BB′) by noting that (4) implies tr A′Λ = tr Λ′Λ. Then, the standardized loss function

f_S(B) = 1 − tr BB′ / tr S,  (10)

which takes values within [0, 1], can be used for convenience instead of f(Λ, Ψ).
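The covariance-based updates (5′), (6′) and (4) make one full pass of the algorithm expressible through S and B alone. The following numpy sketch is our reading of such a pass under the assumptions stated above; it is illustrative code, not the authors' implementation.

```python
import numpy as np

def als_iteration(S, Lam, psi, q):
    """One pass of the updates: EVD of B'SB, then (6'), (5') and (4).

    S: p x p sample covariance matrix; Lam: p x m loadings;
    psi: length-p diagonal of Psi; q: required number of zero loadings.
    """
    p, m = Lam.shape
    B = np.hstack([Lam, np.diag(psi)])               # B = [Lambda, Psi]
    evals, Q = np.linalg.eigh(B.T @ S @ B)           # B'SB = Q Delta^2 Q'
    order = np.argsort(evals)[::-1][:p]              # p leading eigenpairs
    Q1 = Q[:, order]                                 # (m+p) x p
    Delta1 = np.sqrt(np.maximum(evals[order], 0.0))  # diagonal of Delta_1
    Bplus_t = np.linalg.solve(B @ B.T, B)            # B^{+'} = (BB')^{-1} B
    C = Bplus_t @ (Q1 * Delta1) @ Q1.T               # B^{+'} Q1 Delta1 Q1', eq. (9)
    psi_new = np.diag(C[:, m:]).copy()               # update (6')
    A = C[:, :m]                                     # A from (5')
    Lam_new = A.copy()                               # update (4):
    Lam_new.ravel()[np.argsort(np.abs(A), axis=None)[:q]] = 0.0  # zero q smallest |a_jk|
    return Lam_new, psi_new

def standardized_loss(Lam, psi, S):
    """f_S(B) = 1 - tr(BB')/tr(S), eq. (10)."""
    return 1.0 - (np.sum(Lam**2) + np.sum(psi**2)) / np.trace(S)
```

Iterating als_iteration until the decrease of standardized_loss becomes negligible realizes Steps 2-6 of the algorithm summarized below.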

The optimal solution with sparseness (3) is thus given by the following algorithm:

Step 1. Initialize B = [Λ, Ψ].
Step 2. Perform the EVD of B′SB.
Step 3. Update Ψ with (6′).
Step 4. Obtain A with (5′).
Step 5. Update Λ with (4).
Step 6. Finish if convergence is reached; otherwise, go back to Step 2.

To avoid missing the global minimum, we run the algorithm multiple times with different random initializations of B in Step 1, and the optimal run is selected via a procedure described in Section 5. We denote the resulting solution of B as B̂_q = [Λ̂_q, Ψ̂_q], where the subscript q indicates the particular number of zeros used in (3).

4 Sparseness Selection

Sparseness can be restated as parsimony: the greater SP(Λ) is, the fewer parameters are to be estimated and the greater the resulting loss function value is. Thus, sparseness selection means choosing the FA model with the optimal combination of attained loss function value and parsimony. For such model selection, we can use information criteria (Schwarz, 1978), which are defined using maximum likelihood (ML) estimates. Although an ML method is not used in our algorithm, we assume that B̂_q = [Λ̂_q, Ψ̂_q] is equivalent to the ML FA solution which maximizes the log likelihood

L(Λ, Ψ) = −0.5n{log|ΛΛ′ + Ψ²| + tr S(ΛΛ′ + Ψ²)⁻¹}

with the locations of the zero loadings constrained to be those of Λ̂_q. Under this assumption, we propose to use the information criterion BIC (Schwarz, 1978) for choosing the optimal q. For B̂_q, BIC can be expressed as

BIC(q) = −2L(Λ̂_q, Ψ̂_q) − q log n + c*,  (11)

with c* a constant irrelevant to q. The optimal sparseness is thus defined as

q̂ = argmin_{q_min ≤ q ≤ q_max} BIC(q),  (12)

and B̂_q̂ (i.e., B̂_q with q = q̂) is chosen as the final solution B̂, with q_min = m(m − 1)/2 and q_max = p(m − 1).
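A compact sketch of this selection rule follows. Here run_als is a hypothetical driver (not defined in the paper) that iterates the Section 3 updates to convergence for a given q and returns (Λ̂_q, diagonal of Ψ̂_q); the constant c* is dropped since it does not affect the argmin.

```python
import numpy as np

def log_likelihood(Lam, psi, S, n):
    """L = -0.5 n {log|Sigma| + tr(S Sigma^{-1})} with Sigma = Lam Lam' + Psi^2."""
    Sigma = Lam @ Lam.T + np.diag(psi**2)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * n * (logdet + np.trace(S @ np.linalg.inv(Sigma)))

def select_sparseness(S, n, p, m, run_als):
    """Minimize BIC(q) over q_min <= q <= q_max, eqs. (11)-(12)."""
    q_min, q_max = m * (m - 1) // 2, p * (m - 1)
    best = None
    for q in range(q_min, q_max + 1):
        Lam, psi = run_als(S, q)                     # hypothetical ALS driver
        bic = -2.0 * log_likelihood(Lam, psi, S, n) - q * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, q, Lam, psi)
    _, q_hat, Lam_hat, psi_hat = best
    return q_hat, Lam_hat, psi_hat
```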

5 Simulation Study

We performed a simulation study to assess the proposed procedure with respect to its exactness in identifying the true sparseness and the locations of zero loadings, the goodness of recovery of the parameter values, and its sensitivity to local minima.

Figure 1: Three loading matrices of simple structure (left) and two of bi-factor structure (right). [The matrix diagrams are not reproduced here; in them, a blank cell denotes a zero loading, # a non-zero loading, and r a loading that is randomly zero or non-zero.]

We used the five types of true Λ shown in Figure 1. For each type, we generated 40 sets of {Λ, Ψ, S} by the following steps: 1) each diagonal element of Ψ was set to u(0.1^{1/2}, 0.7^{1/2}); 2) each non-zero value in Λ was set to u(0.4, 1), while each element denoted by r in Figure 1 was randomly set to zero or u(0.4, 1); 3) Λ was normalized so as to satisfy diag(ΛΛ′ + Ψ²) = I_p; 4) setting n = 200p, we sampled each row of X from the centred p-variate normal distribution with covariance matrix ΛΛ′ + Ψ²; 5) the inter-variable correlation matrix S was obtained from X. Here, u(α, β) denotes a value drawn from the uniform distribution on the range [α, β].

The procedures described in Sections 2, 3 and 4 were applied to the resulting 200 (= 40 × 5) matrices S, where the algorithm in Section 3 was run multiple times with a two-optimal-solutions stopping procedure. Writing B̂_q^(l) for the solution of B resulting from the l-th run, the procedure is as follows (a code sketch is given after this description):

Phase 1. Set L_q = 50 and obtain B̂_q^(l) for l = 1, …, L_q; find l* = argmin_{1≤l≤L_q} f_S(B̂_q^(l)) and set B̂_q = B̂_q^(l*).
Phase 2. Finish if B̂_q is equivalent to a B̃_q resulting from the l̃-th run with l̃ ≠ l*; otherwise, go to Phase 3.
Phase 3. Set L_q := L_q + 1, and let B̃_q be the output from another run.
Phase 4. Exchange B̂_q for B̃_q if f_S(B̃_q) < f_S(B̂_q).
Phase 5. Finish if B̂_q and B̃_q are equivalent or L_q = 200; otherwise, go back to Phase 3.

Here, the equivalence of B̂_q = [Λ̂_q, Ψ̂_q] and B̃_q = [Λ̃_q, Ψ̃_q] is defined as 2⁻¹(‖Λ̂_q − Λ̃_q‖₁/(mp) + ‖Ψ̂_q1_p − Ψ̃_q1_p‖₁/p) being less than 10⁻³, where ‖·‖₁ denotes the sum of the absolute values of the elements of its argument and 1_p is the p × 1 vector of ones. Except for B̂_q and B̃_q, the remaining L_q − 2 solutions are local minimizers; clearly, the value of L_q indicates the sensitivity of the algorithm to local minima. We obtained the average of the L_q values over all q for each data set. The quartiles of those averages over the 200 data sets were 89, 120, and 155, which demonstrates high sensitivity to local minima. Nevertheless, the good performance of the proposed procedure is shown next.
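The stopping rule in Phases 1-5 can be sketched as follows; run_once is a hypothetical helper performing one randomly initialized run of the Section 3 algorithm and returning (Λ, diagonal of Ψ), and standardized_loss is the f_S sketch given in Section 3.

```python
import numpy as np

def equivalent(sol_a, sol_b, p, m, tol=1e-3):
    """Equivalence criterion: averaged L1 differences below 10^{-3}."""
    (La, pa), (Lb, pb) = sol_a, sol_b
    d = 0.5 * (np.abs(La - Lb).sum() / (m * p) + np.abs(pa - pb).sum() / p)
    return d < tol

def multi_start(S, q, p, m, run_once, L_init=50, L_max=200):
    sols = [run_once(S, q) for _ in range(L_init)]         # Phase 1
    f_S = lambda sol: standardized_loss(sol[0], sol[1], S)
    best = min(sols, key=f_S)
    if any(equivalent(best, s, p, m) for s in sols if s is not best):
        return best                                        # Phase 2: found twice
    L = L_init
    while L < L_max:
        L += 1
        cand = run_once(S, q)                              # Phase 3: one more run
        if f_S(cand) < f_S(best):
            best, cand = cand, best                        # Phase 4: keep the better
        if equivalent(best, cand, p, m):
            return best                                    # Phase 5: found twice
    return best                                            # give up after L_max runs
```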

Table 1 shows the distributions, over the 200 data sets, of the indices measuring the correctness of q̂ and B̂.

Table 1: Distributions of the indices for the correctness of the estimated sparseness and parameters. [The columns give percentiles of BSE, the identification rates R_00 and R_##, and the two mean absolute differences; the numerical entries are not reproduced here.]

The percentiles of BSE = (q − q̂)/q, which assesses the relative bias of the estimated sparseness from the true q, show that sparseness was satisfactorily estimated, though it tended to be underestimated. The indices R_00 and R_## are the rates of the zero and non-zero elements in the true Λ correctly identified by Λ̂; the non-zero elements are found in Table 1 to have been identified exactly. The fourth and fifth indices are the mean absolute differences ‖Λ − Λ̂‖₁/(pm) and ‖(Ψ² − Ψ̂²)1_p‖₁/p, whose percentiles show that the parameter values were recovered very well.

6 Conclusions

In order to overcome the difficulties with EFA and CFA, we proposed a new FA procedure in which the optimal solution is estimated subject to a direct sparseness constraint on the loadings and the best sparseness is selected using BIC. The simulation study demonstrated that the procedure recovers the true sparseness and parameter values well.

References

1. Adachi, K.: Some contributions to data-fitting factor analysis with empirical comparisons to covariance-fitting factor analysis. J. Japan. Soc. Comp. Stat., 25 (2012).
2. de Leeuw, J.: Least squares optimal scaling of partially observed linear systems. In: van Montfort, K., Oud, J., Satorra, A. (eds.) Recent Developments on Structural Equation Models: Theory and Applications. Kluwer Academic Publishers, Dordrecht (2004).
3. Jolliffe, I.T., Trendafilov, N.T., Uddin, M.: A modified principal component technique based on the LASSO. J. Comp. Graph. Stat., 12 (2003).
4. Trendafilov, N.T., Unkel, S.: Exploratory factor analysis of data matrices with more variables than observations. J. Comp. Graph. Stat., 20 (2011).
5. Unkel, S., Trendafilov, N.T.: Simultaneous parameter estimation in exploratory factor analysis: An expository review. International Stat. Review, 78 (2010).
6. Mulaik, S.A.: Foundations of Factor Analysis, Second Edition. CRC Press, Boca Raton (2010).
7. Schwarz, G.: Estimating the dimension of a model. Ann. Stat., 6 (1978).
8. Zou, H., Hastie, T., Tibshirani, R.: Sparse principal component analysis. J. Comp. Graph. Stat., 15 (2006).
