TAMS39 Lecture 10: Principal Component Analysis and Factor Analysis


TAMS39 Lecture 10: Principal Component Analysis and Factor Analysis
Martin Singull
Department of Mathematics, Mathematical Statistics
Linköping University, Sweden

Content - Lecture
- Principal component analysis (PCA)
- Factor analysis (FA)

Principal component analysis (PCA)
A principal component analysis (PCA) is concerned with explaining the variance-covariance structure of a set of variables through a few linear combinations of these variables. Its general objectives are data reduction and interpretation. Although p components are required to reproduce the total system variability, often much of this variability can be accounted for by a small number k of the principal components.

Let $x = (x_1, \ldots, x_p)'$ be a random vector with covariance matrix $\Sigma$. Algebraically, the principal components are particular linear combinations of the p random variables. Geometrically, these linear combinations represent the selection of a new coordinate system obtained by rotating the original system with $x_1, \ldots, x_p$ as the coordinate axes.

Consider the linear combinations
$y_1 = a_1'x, \ \ldots, \ y_p = a_p'x,$
with variances and covariances
$\mathrm{var}(y_i) = a_i'\Sigma a_i, \quad i = 1, \ldots, p,$
$\mathrm{cov}(y_i, y_k) = a_i'\Sigma a_k, \quad i, k = 1, \ldots, p.$
The principal components are those uncorrelated linear combinations $y_1, \ldots, y_p$ whose variances above are as large as possible.

First principal component = the linear combination $y_1 = a_1'x$ that maximizes $\mathrm{var}(a_1'x)$ subject to $a_1'a_1 = 1$.
...
ith principal component = the linear combination $y_i = a_i'x$ that maximizes $\mathrm{var}(a_i'x)$ subject to $a_i'a_i = 1$ and $\mathrm{cov}(a_i'x, a_k'x) = 0$ for $k < i$, $i = 1, \ldots, p$.

Let $\lambda_1, \ldots, \lambda_p > 0$ be the eigenvalues of the matrix $\Sigma$ and let $H = (h_1, \ldots, h_p)$ be a $p \times p$ orthogonal matrix such that
$H'\Sigma H = \mathrm{diag}(\lambda_1, \ldots, \lambda_p) = \Lambda,$
so that $h_i$ is an eigenvector of $\Sigma$ corresponding to the eigenvalue $\lambda_i$. We now have that the covariance between any linear combination $a'x$ and a linear combination $h_i'x$ based on an eigenvector is given by
$\mathrm{cov}(a'x, h_i'x) = a'\Sigma h_i = \lambda_i\, a'h_i.$
Hence, $\mathrm{cov}(a'x, h_i'x) = 0$ is the same as $a$ and $h_i$ being orthogonal.

Theorem. For $k = 1, \ldots, p$,
$\lambda_k = \max_{a'a = 1,\ a'h_i = 0,\ i = 1, \ldots, k-1} a'\Sigma a = h_k'\Sigma h_k.$
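To make the eigen-decomposition view concrete, here is a minimal numerical sketch (the covariance matrix is hypothetical, not from the lecture): the eigenvectors of $\Sigma$ give the coefficient vectors of the principal components and the eigenvalues give their variances.

```python
import numpy as np

# Hypothetical 3 x 3 covariance matrix (illustration only).
Sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

# eigh is for symmetric matrices; it returns eigenvalues in ascending order.
eigval, eigvec = np.linalg.eigh(Sigma)
order = np.argsort(eigval)[::-1]            # reorder so that lambda_1 >= ... >= lambda_p
lambdas, H = eigval[order], eigvec[:, order]

# Column h_i of H is the coefficient vector of the i-th principal component
# y_i = h_i' x, and var(y_i) = lambda_i.
print("eigenvalues:", lambdas)
print("first principal component direction h_1:", H[:, 0])
```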


Measures of total variation
Note that in transforming to principal components the measures $\mathrm{tr}\,\Sigma$ and $|\Sigma|$ of total variation are unchanged, for
$\mathrm{tr}\,\Sigma = \mathrm{tr}(H'\Sigma H) = \mathrm{tr}\,\Lambda = \sum_{i=1}^{p} \lambda_i, \qquad |\Sigma| = |H'\Sigma H| = |\Lambda| = \prod_{i=1}^{p} \lambda_i.$
Note also that $\sum_{i=1}^{k} \lambda_i$ is the total variance of the first k principal components. In principal component analysis the hope is that, for some small k, this variance is close to $\mathrm{tr}\,\Sigma$, i.e., the first k principal components explain most of the variation in $x$, and the remaining $p - k$ principal components contribute little.
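A small sketch of this measure, using illustrative eigenvalues (they happen to be those of the stock-return example later in the lecture): the proportion of total variation explained by the first k components is the cumulative sum of the eigenvalues divided by $\mathrm{tr}\,\Sigma$.

```python
import numpy as np

lambdas = np.array([2.437, 1.407, 0.501, 0.400, 0.255])   # illustrative eigenvalues

# explained[k-1] = (lambda_1 + ... + lambda_k) / tr(Sigma)
explained = np.cumsum(lambdas) / lambdas.sum()
print(explained)
```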

Sample principal component analysis
Assume that $x_1, \ldots, x_n$ are iid $N_p(\mu, \Sigma)$, i.e., $X = (x_1, \ldots, x_n) \sim N_{p,n}(\mu 1_n', \Sigma, I_n)$. The MLE of $\Sigma$ is $\frac{n-1}{n}S$, where $S$ is the sample covariance matrix given by
$S = \frac{1}{n-1}\, X\left(I_n - 1_n(1_n'1_n)^{-1}1_n'\right)X'.$
The MLEs of the $\lambda_i$'s, the ordered (assumed distinct) eigenvalues of $\Sigma$, are $\frac{n-1}{n}\hat\lambda_i$, where the $\hat\lambda_i$'s are the ordered eigenvalues of $S$. The $\hat\lambda_i$'s are distinct with probability one, since $n > p$.

The sample principal components are defined as
First sample principal component = the linear combination $\hat y_1 = a_1'x$ that maximizes the sample variance $a_1'Sa_1$ subject to $a_1'a_1 = 1$.
...
ith sample principal component = the linear combination $\hat y_i = a_i'x$ that maximizes the sample variance $a_i'Sa_i$ subject to $a_i'a_i = 1$ and $a_i'Sa_k = 0$ for $k < i$, $i = 1, \ldots, p$.

We now have a theorem similar to the one above.
Theorem. For $k = 1, \ldots, p$,
$\hat\lambda_k = \max_{a'a = 1,\ a'\hat h_i = 0,\ i = 1, \ldots, k-1} a'Sa = \hat h_k'S\hat h_k,$
where $\hat h_k$ is an eigenvector of the MLE $\frac{n-1}{n}S$ of $\Sigma$ corresponding to the eigenvalue $\hat\lambda_k$.
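A minimal sketch of sample PCA on simulated (hypothetical) data: compute S, take its eigendecomposition, and project the centered observations onto the eigenvectors to obtain the sample principal component scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 4
# Hypothetical data: n iid observations from N_p(0, diag(4, 2, 1, 0.5)).
X = rng.multivariate_normal(np.zeros(p), np.diag([4.0, 2.0, 1.0, 0.5]), size=n)

S = np.cov(X, rowvar=False)                    # sample covariance matrix (divisor n - 1)
lam_hat, H_hat = np.linalg.eigh(S)
idx = np.argsort(lam_hat)[::-1]                # order eigenvalues decreasingly
lam_hat, H_hat = lam_hat[idx], H_hat[:, idx]

scores = (X - X.mean(axis=0)) @ H_hat          # sample principal component scores
print("lambda_hat:", lam_hat)
print("sample variances of the scores:", scores.var(axis=0, ddof=1))
```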

Asymptotic distributions
Assume that $x_1, \ldots, x_n$ are iid $N_p(\mu, \Sigma)$. Assume also that the eigenvalues of $\Sigma$ are distinct and positive, so that $0 < \lambda_p < \ldots < \lambda_1$. The sampling distributions of the MLEs $\hat\lambda_i$ and $\hat h_i$ are difficult to derive and beyond the scope of this course. We shall simply summarize some results.

Asymptotic mean and variance
Tractable expressions for the exact moments of the eigenvalues of $S$ are unknown, but asymptotic expressions for some of these have been found by Lawley (1956). Lawley has shown that if $\lambda_i$ is a distinct eigenvalue of $\Sigma$, the mean and variance of $\hat\lambda_i$ can be expanded for large $n$ as
$E(\hat\lambda_i) = \lambda_i + \frac{\lambda_i}{n} \sum_{j=1, j \neq i}^{p} \frac{\lambda_j}{\lambda_i - \lambda_j} + O(n^{-2})$
and
$\mathrm{var}(\hat\lambda_i) = \frac{2\lambda_i^2}{n}\left(1 - \frac{1}{n}\sum_{j=1, j \neq i}^{p}\left(\frac{\lambda_j}{\lambda_i - \lambda_j}\right)^2\right) + O(n^{-3}).$

Asymptotic distribution, cont.
Let $\lambda = (\lambda_1, \ldots, \lambda_p)'$ and
$E_i = \lambda_i \sum_{k=1, k \neq i}^{p} \frac{\lambda_k}{(\lambda_k - \lambda_i)^2}\, h_k h_k'.$
Then we have
1. $\sqrt{n}(\hat\lambda - \lambda) \to N_p(0, 2\Lambda^2)$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$, i.e., $\frac{\sqrt{n}(\hat\lambda_i - \lambda_i)}{\sqrt{2}\,\lambda_i} \to N(0, 1)$ for $i = 1, \ldots, p$,
2. $\sqrt{n}(\hat h_i - h_i) \to N_p(0, E_i)$,
3. $\hat\lambda_i$ and $\hat h_i$ are independently distributed for $i = 1, \ldots, p$.

Result 1 implies that, for large $n$, the $\hat\lambda_i$ are independently distributed. Using Result 1 one can also construct confidence intervals for the $\lambda_i$'s. Result 2 implies that the elements of each $\hat h_i$ are correlated, and the correlation depends to a large extent on the separation of the eigenvalues $\lambda_1, \ldots, \lambda_p$ (which are unknown) and the sample size $n$.
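One common way to invert Result 1 into a large-sample confidence interval for $\lambda_i$ is sketched below (the numbers plugged in are only illustrative):

```python
import numpy as np
from scipy import stats

def eigenvalue_ci(lam_hat_i, n, alpha=0.05):
    """Approximate (1 - alpha) CI for lambda_i based on Result 1,
    sqrt(n)(lambda_hat_i - lambda_i) ~ N(0, 2 lambda_i^2)."""
    z = stats.norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(2.0 / n)
    # Solving |lambda_hat_i - lambda_i| <= z sqrt(2/n) lambda_i for lambda_i
    # gives the interval below (requires z sqrt(2/n) < 1).
    return lam_hat_i / (1 + half), lam_hat_i / (1 - half)

print(eigenvalue_ci(2.437, n=103))   # illustrative values
```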

H: $\lambda_{k+1} = \lambda_{k+2} = \ldots = \lambda_p = \lambda$
Suppose we want to test the hypothesis that the last $p - k$ eigenvalues of the covariance matrix $\Sigma$ are equal to $\lambda$. That is, we want to test
$H: \lambda_{k+1} = \lambda_{k+2} = \ldots = \lambda_p = \lambda$ vs. $A$: not $H$,
where $\lambda$ is unknown. This is the so-called isotropy test for the eigenvalues. Typically one conducts a series of isotropy tests, starting with $k = p - 2$ and increasing $k$ until the null hypothesis is accepted.

PCA LRT
The LRT (based on normality) for the hypothesis $H$ is based on the statistic
$Q = \frac{\prod_{j=k+1}^{p} \hat\lambda_j}{\left(\frac{1}{p-k}\sum_{j=k+1}^{p} \hat\lambda_j\right)^{p-k}},$
where $\hat\lambda_1 \geq \hat\lambda_2 \geq \ldots \geq \hat\lambda_p$ are the eigenvalues of the sample covariance matrix $S$ based on $f = n - 1$ degrees of freedom. It has been shown by Lawley (1956) that
$Q^{*} = -\left(f - k - \frac{1}{6}\left(2(p-k) + 1 + \frac{2}{p-k}\right)\right)\ln Q \ \approx\ \chi^2\left(\tfrac{1}{2}(p-k)(p-k+1) - 1\right).$
Hence, $H$ is rejected if $Q^{*} > \chi^2_{1-\alpha}\left(\tfrac{1}{2}(p-k)(p-k+1) - 1\right)$.
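A sketch of the test as reconstructed above; for illustration it is applied to the eigenvalues of the correlation matrix from the stock example that follows, although the statistic is stated for a covariance matrix S with f = n - 1 degrees of freedom.

```python
import numpy as np
from scipy import stats

def isotropy_test(lam, k, f):
    """LRT of H: lambda_{k+1} = ... = lambda_p, given the eigenvalues lam of S."""
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    p = len(lam)
    tail = lam[k:]                                         # lambda_{k+1}, ..., lambda_p
    Q = np.prod(tail) / (tail.mean() ** (p - k))
    c = f - k - (2 * (p - k) + 1 + 2.0 / (p - k)) / 6.0    # correction factor
    Q_star = -c * np.log(Q)
    df = (p - k) * (p - k + 1) / 2 - 1
    return Q_star, stats.chi2.sf(Q_star, df)

lam = [2.437, 1.407, 0.501, 0.400, 0.255]
print(isotropy_test(lam, k=2, f=102))
```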

Example PCA
The weekly rates of return for five stocks (JP Morgan, Citibank, Wells Fargo, Royal Dutch Shell, and ExxonMobil) listed on the New York Stock Exchange were determined for the period January 2004 through December 2005. The weekly rate of return is defined as
(current week closing price - previous week closing price) / previous week closing price,
adjusted for stock splits and dividends. The observations in 103 successive weeks appear to be independently distributed, but the rates of return across stocks are correlated, because, as one expects, stocks tend to move together in response to general economic conditions.

Let $x_1, \ldots, x_5$ denote the observed weekly rates of return for the stocks given above. Then $\bar x = (0.0011, \ldots)'$ and the sample correlation matrix is $R$, where
$R = D_S^{-1/2} S D_S^{-1/2}, \quad \text{with } D_S = \mathrm{diag}(s_{11}, \ldots, s_{55}).$
We note that $R$ is the sample covariance matrix for the standardized observations
$z_i = \frac{x_i - \bar x_i}{\sqrt{s_{ii}}}, \quad i = 1, \ldots, 5.$

The eigenvalues and corresponding normalized eigenvectors of $R$ are
$\hat\lambda_1 = 2.437$, $\hat h_1 = (0.469, 0.532, 0.465, 0.387, 0.361)'$,
$\hat\lambda_2 = 1.407$, $\hat h_2 = (-0.368, -0.236, -0.315, 0.585, 0.606)'$,
$\hat\lambda_3 = 0.501$, $\hat h_3 = (0.604, 0.136, 0.772, 0.093, 0.109)'$,
$\hat\lambda_4 = 0.400$, $\hat h_4 = (0.363, 0.629, 0.289, 0.381, 0.493)'$,
$\hat\lambda_5 = 0.255$, $\hat h_5 = (0.384, 0.496, 0.071, 0.595, 0.498)'$.

Using the standardized variables, we obtain the first two sample principal components:
$\hat y_1 = \hat h_1'z = 0.469z_1 + 0.532z_2 + 0.465z_3 + 0.387z_4 + 0.361z_5,$
$\hat y_2 = \hat h_2'z = -0.368z_1 - 0.236z_2 - 0.315z_3 + 0.585z_4 + 0.606z_5,$
and these components, which account for
$\frac{\hat\lambda_1 + \hat\lambda_2}{p} = \frac{2.437 + 1.407}{5} = 0.77,$
i.e., 77% of the total (standardized) sample variance, have interesting interpretations.
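A quick numerical check of the figures above (a sketch; the standardized observation z is made up):

```python
import numpy as np

lam = np.array([2.437, 1.407, 0.501, 0.400, 0.255])
print((lam[0] + lam[1]) / lam.sum())      # about 0.77; lam.sum() = p = 5 for a correlation matrix

h1 = np.array([0.469, 0.532, 0.465, 0.387, 0.361])
h2 = np.array([-0.368, -0.236, -0.315, 0.585, 0.606])
z = np.array([1.0, 0.5, -0.2, 0.3, 0.8])  # hypothetical standardized observation
print(h1 @ z, h2 @ z)                     # component scores y1_hat and y2_hat
```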

The first component is a roughly equally weighted sum, or index, of the five stocks. This component might be called a general stock-market component, or, simply, a market component. The second component represents a contrast between the banking stocks and the oil stocks. It might be called an industry component. Thus, we see that most of the variation in these stock returns is due to market activity and uncorrelated industry activity.

Factor analysis
Closely related to PCA is factor analysis. Factor analysis is a statistical method used to study the dimensionality of a set of variables. In factor analysis, latent variables represent unobserved constructs and are referred to as factors or dimensions. The essential purpose of factor analysis is to describe, if possible, the covariance relationships among many variables in terms of a few underlying, but unobservable, random quantities called factors.

Factor analysis - Example
Suppose we wish to judge the abilities of high school students entering university. We may give them a test of 50 questions. These 50 questions, however, may fall into a few categories, such as reading comprehension, mathematics, and arts. Here we have only three factors.

The score of any randomly selected high school student on the ith question, denoted by $y_i$, can be modeled in the form
$y_i = \mu_i + \lambda_{i1} f_1 + \lambda_{i2} f_2 + \lambda_{i3} f_3 + \varepsilon_i, \quad i = 1, \ldots, 50,$
where $\mu_i$ is the mean of $y_i$. Without loss of generality we can assume that $f_j \sim N(0, 1)$ iid for $j = 1, 2, 3$, independently of the errors $\varepsilon_i \sim N(0, \psi_i)$, which are themselves independent, for $i = 1, \ldots, 50$.

Factor analysis - Model
In matrix notation, the model with $k$ factors and $p$ characteristics of a subject can be written as
$y = \mu + \Lambda f + \varepsilon,$
where $y = (y_1, \ldots, y_p)'$, $\mu = (\mu_1, \ldots, \mu_p)'$,
$\varepsilon = (\varepsilon_1, \ldots, \varepsilon_p)' \sim N_p(0, \Psi)$ with $\Psi = \mathrm{diag}(\psi_1, \ldots, \psi_p)$,
$f = (f_1, \ldots, f_k)' \sim N_k(0, I_k)$, and
$\Lambda = \begin{pmatrix} \lambda_{11} & \cdots & \lambda_{1k} \\ \vdots & & \vdots \\ \lambda_{p1} & \cdots & \lambda_{pk} \end{pmatrix} : p \times k.$

Factor analysis - Covariance matrix
Since $\mathrm{cov}(f) = I_k$, $\mathrm{cov}(\varepsilon) = \Psi$ and $\mathrm{cov}(f, \varepsilon) = 0$, it follows that the covariance matrix of $y$ is given by
$\mathrm{cov}(y) = \Lambda\,\mathrm{cov}(f)\,\Lambda' + \mathrm{cov}(\varepsilon) = \Lambda\Lambda' + \Psi \equiv \Sigma.$
Note that the value of $\Sigma$ is unchanged if $\Lambda$ is post-multiplied by any $k \times k$ orthogonal matrix. Hence, there is no unique choice of $\Lambda$.
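A small numerical sketch of this rotation indeterminacy (all matrices below are made up): post-multiplying $\Lambda$ by an orthogonal matrix H leaves $\Lambda\Lambda' + \Psi$ unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 5, 2
Lam = rng.normal(size=(p, k))                     # hypothetical loading matrix
Psi = np.diag(rng.uniform(0.2, 1.0, size=p))      # hypothetical specific variances

theta = 0.7                                       # any rotation angle
H = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # 2 x 2 orthogonal matrix

Sigma1 = Lam @ Lam.T + Psi
Sigma2 = (Lam @ H) @ (Lam @ H).T + Psi
print(np.allclose(Sigma1, Sigma2))                # True: Sigma is unchanged
```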

Factor analysis - Number of factors
The number of parameters that need to be estimated is $pk$ for $\Lambda$ and $p$ for the diagonal elements of $\Psi$, totalling $p(k+1)$. The number of quantities available for estimation is the $p(p+1)/2$ elements in the sample covariance matrix $S$. Thus, in principle, the number of factors selected to represent the data should satisfy $p(k+1) \leq p(p+1)/2$, i.e., $k \leq (p-1)/2$.
We have noted that the value of $\Sigma$ is unchanged if $\Lambda$ is post-multiplied by any $k \times k$ orthogonal matrix. Thus, the effective number of parameters is not $p(k+1)$, but $p(k+1) - k(k-1)/2$. Hence,
$p(k+1) - \frac{k(k-1)}{2} \leq \frac{p(p+1)}{2} \quad\Longleftrightarrow\quad k \leq \frac{1}{2}\left((2p+1) - \sqrt{8p+1}\right).$
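A one-line check of the refined bound (a sketch; the helper name is ours):

```python
import numpy as np

def max_factors(p):
    """Largest integer k with p(k+1) - k(k-1)/2 <= p(p+1)/2,
    i.e. k <= ((2p + 1) - sqrt(8p + 1)) / 2."""
    return int(np.floor(((2 * p + 1) - np.sqrt(8 * p + 1)) / 2))

print([(p, max_factors(p)) for p in (5, 10, 50)])
```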

Factor analysis - Uniqueness
It should be mentioned, however, that even if $k$ satisfies the inequality, this does not necessarily imply that a solution exists, let alone a unique one. The inequality above was arrived at by requiring that the number of unknowns be less than or equal to the number of equations available. To get a unique solution, we not only require that $k$ satisfies the inequality, but also that the $k \times k$ matrix $\Lambda'\Psi^{-1}\Lambda$ is a diagonal matrix with diagonal elements ordered from largest to smallest (see Lawley and Maxwell, 1970).

Factor analysis - MLE
The estimates cannot be obtained explicitly, and iterative methods have to be used. To obtain the unique ML solution when factor analysis is carried out on the correlation matrix
$R = D_S^{-1/2} S D_S^{-1/2}, \quad \text{where } D_S = \mathrm{diag}(s_{11}, \ldots, s_{pp}),$
we need to solve the equations
$\sum_{j=1}^{k} \hat\lambda_{ij}^2 + \hat\psi_i = 1, \quad i = 1, \ldots, p,$
$\left(\hat\Psi^{-1/2} R\, \hat\Psi^{-1/2}\right)\left(\hat\Psi^{-1/2}\hat\Lambda\right) = \left(\hat\Psi^{-1/2}\hat\Lambda\right)\tilde D,$
where $\tilde D$ is the diagonal matrix $\tilde D = I + \hat D$ and $\hat D$ is the diagonal matrix $\hat D = \hat\Lambda'\hat\Psi^{-1}\hat\Lambda$ (uniqueness condition).
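Since these equations must be solved iteratively, in practice one calls a library routine. Below is a sketch using scikit-learn on simulated (hypothetical) data; note that sklearn.decomposition.FactorAnalysis fits the model with an EM-type algorithm on the supplied data rather than via the correlation-matrix equations above, so the columns are standardized first to mimic an analysis of R, and the returned loadings are determined only up to an orthogonal rotation.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical data generated from a 2-factor model (illustration only).
rng = np.random.default_rng(2)
n, p, k = 200, 5, 2
F = rng.normal(size=(n, k))                        # latent factors
Lam = rng.normal(size=(p, k))                      # true loadings
X = F @ Lam.T + rng.normal(scale=0.5, size=(n, p))

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized observations
fa = FactorAnalysis(n_components=k).fit(Z)
print(fa.components_.T)                            # estimated loadings, p x k (up to rotation)
print(fa.noise_variance_)                          # estimated specific variances psi_hat
```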

Factor analysis - Choosing the number of factors
In factor analysis, we seek a diagonal matrix $\Psi$ with positive diagonal elements such that $\Sigma - \Psi$ is a positive semidefinite matrix of rank $k$. It can be shown that such a $k$ will always be larger than the number of eigenvalues of the population correlation matrix that are greater than one (see Guttman, 1954). Since the population correlation matrix can be estimated by the sample correlation matrix $R$, a rule of thumb often used in statistical packages chooses $k$ to be the number of eigenvalues of $R$ greater than 1. This choice of $k$ can be used as an initial guess for the number of factors.
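A sketch of the rule of thumb, applied to the eigenvalues of R from the stock example earlier in the lecture:

```python
import numpy as np

eig_R = np.array([2.437, 1.407, 0.501, 0.400, 0.255])   # eigenvalues of R from the example
k_init = int(np.sum(eig_R > 1.0))
print(k_init)   # 2, which is the number of factors used in the example below
```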

With this choice of $k$, we may test its adequacy by testing the hypothesis
$H: \Sigma = \Lambda\Lambda' + \Psi \quad \text{vs.} \quad A: \Sigma \neq \Lambda\Lambda' + \Psi,$
where $\Lambda$ is a $p \times k$ matrix. The MLE of $\Sigma$ under the alternative is $S$, and under $H$ with $k$ factors it is given by
$\hat\Sigma_k = \hat\Lambda_k \hat\Lambda_k' + \hat\Psi,$
where $\hat\Lambda_k$ and $\hat\Psi$ are the MLEs for $k$ factors.

Factor analysis - LRT
One can show that an asymptotic test statistic, based on the LRT, is given by
$-n \ln\frac{|S|}{|\hat\Sigma_k|} \ \approx\ \chi^2(g), \quad \text{where } g = \frac{1}{2}\left((p-k)^2 - (p+k)\right).$
However, Bartlett (1954) suggested replacing $n$ in the above expression by $n - (2p + 4k + 11)/6$ to get a better approximation. This factor is known as Bartlett's correction. The hypothesis $H$ is rejected if
$-\left(n - \frac{2p + 4k + 11}{6}\right)\ln\frac{|S|}{|\hat\Sigma_k|} > \chi^2_{1-\alpha}(g).$
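A sketch of the corrected test statistic as reconstructed above; Sigma_hat stands for whatever fitted matrix $\hat\Lambda_k\hat\Lambda_k' + \hat\Psi$ has been obtained (no model fitting is done in this snippet).

```python
import numpy as np
from scipy import stats

def fa_lrt(S, Sigma_hat, n, p, k):
    """Bartlett-corrected LRT of H: Sigma = Lambda Lambda' + Psi with k factors.
    Returns the test statistic and its asymptotic p-value."""
    g = ((p - k) ** 2 - (p + k)) / 2.0
    c = n - (2 * p + 4 * k + 11) / 6.0
    # At the ML solution |Sigma_hat| >= |S|, so the statistic is nonnegative.
    stat = -c * np.log(np.linalg.det(S) / np.linalg.det(Sigma_hat))
    return stat, stats.chi2.sf(stat, g)
```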

Example - Factor analysis
The stock-price data above are analyzed assuming a $k = 2$ factor model and using the ML method. The maximum likelihood solution gives, for each variable (1. JP Morgan, 2. Citibank, 3. Wells Fargo, 4. Royal Dutch Shell, 5. ExxonMobil), the estimated factor loadings $\hat\lambda_{i1}, \hat\lambda_{i2}$ and the specific variances $\hat\psi_i\ (= 1 - \hat h_i^2)$, where $\hat h_i^2$ denotes the communality of variable $i$.


Linköping University - Research that makes a difference
