Stat 206, Week 6: Factor Analysis


James Johndrow

Introduction

Factor analysis aims to explain the correlations among a large set ($p$) of variables in terms of a smaller number ($m$) of underlying factors. The factors are assumed unobserved, and there is also observation noise/random error. Spearman thought of the following example. Consider children's exam performance in $X_1 =$ classics, $X_2 =$ french, $X_3 =$ english, with observed correlation matrix
$$R = \begin{pmatrix} 1 & .83 & .78 \\ .83 & 1 & .67 \\ .78 & .67 & 1 \end{pmatrix}.$$
A model reducing from $p = 3$ variables to $m = 1$ factor is
$$X_1 = l_1 F + \epsilon_1, \quad X_2 = l_2 F + \epsilon_2, \quad X_3 = l_3 F + \epsilon_3.$$
In this model, $F$ is referred to as a common factor (or latent factor, since it is unobserved), $l_1, l_2, l_3$ are the factor loadings, and $\epsilon_1, \epsilon_2, \epsilon_3$ are random errors. The common factor might have an interpretation as "general ability." The errors $\epsilon_j$ capture differences in ability in subject $j$ from general ability, and also the fact that an exam score is an imperfect measure of a student's ability.

The orthogonal factor model

We can generalize this to a factor model with $m$ common factors. A model with this structure that generates a random $p$-vector $X$ is given by
$$X = \mu + LF + \epsilon$$
where $\mu$ is a $p \times 1$ mean vector, $L$ is a $p \times m$ factor loadings matrix, $F$ is an $m \times 1$ vector of factors, and $\epsilon$ is a $p \times 1$ vector of random errors. We can also express this as

$$X_j = \mu_j + \sum_{k=1}^m l_{jk} F_k + \epsilon_j,$$
where $\mu_j$ is the mean of $X_j$, $l_{jk}$ is the loading of the $j$th component of $X$ on the $k$th factor, $F_k$ is the $k$th common factor, and $\epsilon_j$ is the $j$th specific factor. Here $j = 1, \dots, p$ indexes variables (components of the random vector $X$), and $k = 1, \dots, m < p$ indexes factors. We make the following assumptions on the unobserved random vectors $F, \epsilon$:

1. $E[F] = 0$, $\mathrm{cov}(F) = I_m$.
2. $E[\epsilon] = 0$, $\mathrm{cov}(\epsilon) = \Psi$, with $\Psi$ a diagonal matrix.
3. $\mathrm{cov}(F, \epsilon) = 0$.

The name "orthogonal factors" comes from the assumption $\mathrm{cov}(F) = I_m$. The elements of $\Psi = \mathrm{diag}(\psi_j)$ are called specific variances or uniquenesses. These assumptions have important consequences for the covariance structure:
$$\Sigma = \mathrm{cov}(X) = LL' + \Psi, \qquad \mathrm{cov}(X, F) = L.$$
Let's write these out coordinate-wise: $\mathrm{cov}(X_j, F_k) = l_{jk}$ is the loading of the $j$th variable on the $k$th factor, and
$$\Sigma_{jj} = \mathrm{var}(X_j) = (LL')_{jj} + \psi_j = h_j^2 + \psi_j,$$
where
$$h_j^2 = (LL')_{jj} = \sum_{k=1}^m l_{jk}^2$$
is the $j$th communality.
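These identities are easy to check by simulation. Below is a minimal R sketch (my own, not from the notes) that generates data from Spearman's one-factor model, using the loadings and uniquenesses fitted later in this handout, and verifies that the sample covariance approaches $LL' + \Psi$.

set.seed(1)
n <- 1e5
l <- c(.983, .844, .794)    # one-factor loadings (fitted values below)
psi <- c(.034, .287, .370)  # specific variances
f <- rnorm(n)               # common factor: E[F] = 0, var(F) = 1
eps <- matrix(rnorm(3*n), n, 3) %*% diag(sqrt(psi))  # errors with cov Psi
X <- outer(f, l) + eps      # X_j = l_j F + eps_j (taking mu = 0)
round(cov(X) - (l %*% t(l) + diag(psi)), 2)  # ~ 0: cov(X) = LL' + Psi
l^2 + psi                   # var(X_j) = h_j^2 + psi_j, each ~ 1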

Consider again Spearman's example. Here $p = 3$, $m = 1$, $L = (l_1, l_2, l_3)'$, and
$$\Sigma = LL' + \Psi = \begin{pmatrix} l_1^2 + \psi_1 & l_1 l_2 & l_1 l_3 \\ l_1 l_2 & l_2^2 + \psi_2 & l_2 l_3 \\ l_1 l_3 & l_2 l_3 & l_3^2 + \psi_3 \end{pmatrix}.$$

An identifiability issue exists with the model as we have presented it thus far. If $\Gamma$ is any $m \times m$ orthogonal matrix, then
$$LF = (L\Gamma)(\Gamma' F),$$
and so we can replace $L \to L\Gamma$, $F \to \Gamma' F$ without changing the dependence structure of $X$, since $\Sigma = LL' + \Psi = L\Gamma\Gamma' L' + \Psi$. Since the model doesn't change, we'll later use this in two ways:

1. To rotate the factors to make them more interpretable.
2. To assist in optimization for maximum likelihood estimation.

Exercise 1. This describes the non-identifiability completely. Prove that if $LL' = \tilde{L}\tilde{L}'$ then there exists an $m \times m$ orthogonal matrix $\Gamma$ such that $\tilde{L} = L\Gamma$. (Hint: use the singular value decomposition.)

We can make the model identifiable by imposing constraints on the loadings. For example, we can constrain
$$L' \Psi^{-1} L = \mathrm{diag}(a_1, \dots, a_m) \tag{1}$$
to be an $m \times m$ diagonal matrix, which is equivalent to saying that the columns of $\Psi^{-1/2} L$ are orthogonal. Sometimes the variant constraint that $L' D^{-1} L$ is diagonal is used, where $D = \mathrm{diag}(\Sigma_{11}, \dots, \Sigma_{pp})$.
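Numerically, the rotation invariance is immediate. Here is a short R check (my own sketch, with an arbitrary loadings matrix), using a random orthogonal matrix generated via a QR decomposition.

p <- 4; m <- 2
L <- matrix(c(.9, .8, .3, .2, .1, .3, .8, .7), p, m)  # arbitrary loadings
set.seed(2)
Gamma <- qr.Q(qr(matrix(rnorm(m*m), m, m)))  # random m x m orthogonal matrix
L.rot <- L %*% Gamma                         # rotated loadings
max(abs(L %*% t(L) - L.rot %*% t(L.rot)))    # ~ 0: LL' is unchanged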

The model accomplishes dimension reduction. We have seen that the orthogonal factor model yields the decomposition
$$\Sigma = LL' + \Psi. \tag{2}$$
$\Sigma$ has $p$ diagonal and $p(p-1)/2$ off-diagonal parameters, for a total of $\nu = p(p+1)/2$. Now let's count the parameters on the right side of (2). Since $\Psi$ is diagonal, it has $p$ parameters. It looks like $L$ has $pm$ parameters, but (1) imposes $m(m-1)/2$ constraints, so the right side actually has $\nu_0 = pm - m(m-1)/2 + p$ free parameters. The reduction in parameters from a general $\Sigma$ to the form (2) is thus
$$s = p(p+1)/2 - pm + m(m-1)/2 - p = \tfrac{1}{2}\left[(p-m)^2 - (p+m)\right].$$
Generally $s > 0$, and $s$ then represents the extent of dimension reduction accomplished by the factor model. The value $s$ can be huge when $m \ll p$, so factor models are particularly attractive in high-dimensional settings where $p$ is large, particularly if $n$ is not also very large. If $\Sigma = LL' + \Psi$ represents a null hypothesis, and a general $\Sigma$ represents the alternative, then $s = \nu - \nu_0$ also gives the degrees of freedom of the asymptotic $\chi^2$ distribution of the likelihood ratio test (see below).

The factor model is scale invariant. If $X$ follows the factor model $X = \mu + LF + \epsilon$, then so does $CX$ for a diagonal matrix $C$:
$$Y = CX = C\mu + CLF + C\epsilon = \mu_C + L_C F + \epsilon_C,$$
where $\mathrm{cov}(\epsilon_C) = C\Psi C'$ is diagonal since $C$ and $\Psi$ are diagonal. So, in this sense, factor analysis is unaffected by rescaling of the variables.
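As a quick illustration of the parameter counting, here is a small helper (a sketch of my own, not from the notes):

fa.df <- function(p, m) {
  nu  <- p * (p + 1) / 2              # parameters in a general Sigma
  nu0 <- p * m - m * (m - 1) / 2 + p  # parameters in L and Psi under (1)
  c(nu = nu, nu0 = nu0, s = nu - nu0)
}
fa.df(3, 1)   # Spearman's example: s = 0, exactly identified
fa.df(30, 4)  # a DJIA-sized problem: s = 321, a large reduction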

Use of the (sample) correlation matrix

Thus we may standardize the variables
$$Z = V^{-1/2}(X - \mu), \qquad V = \mathrm{diag}(\Sigma),$$
and specify a factor model for the population correlation matrix
$$P = L_z L_z' + \Psi_z, \qquad L_z = V^{-1/2} L, \quad \Psi_z = V^{-1/2} \Psi V^{-1/2}.$$
With data $x_i$ we standardize $z_i = \hat{V}^{-1/2}(x_i - \bar{x})$ with $\hat{V} = \mathrm{diag}(S)$ and fit
$$R = \hat{L}_z \hat{L}_z' + \hat{\Psi}_z.$$
In this case $\hat{\psi}_j = 1 - \hat{h}_j^2 = 1 - \sum_{k=1}^m \hat{l}_{jk}^2$, and $\hat{\Sigma}_{jj} = s_{jj}$.

Returning to Spearman's example, $p = 3$, $m = 1$, so $s = \tfrac{1}{2}[(3-1)^2 - (3+1)] = 0$. In fitting $R = \hat{L}_z \hat{L}_z' + \hat{\Psi}_z$, there are 6 equations in 6 unknowns. Indeed, you can check that
$$\hat{L}_z = \begin{pmatrix} .983 \\ .844 \\ .794 \end{pmatrix}, \qquad \hat{\Psi}_z = \mathrm{diag}(.034, .287, .370).$$
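Since $s = 0$ here, the six equations can be solved in closed form. A sketch of that solution in R (my own; the expressions are the standard ones for a one-factor model on three variables):

r12 <- .83; r13 <- .78; r23 <- .67  # off-diagonals of R
l1 <- sqrt(r12 * r13 / r23)         # from the equations l_i l_j = r_ij
l2 <- sqrt(r12 * r23 / r13)
l3 <- sqrt(r13 * r23 / r12)
round(c(l1, l2, l3), 3)             # reproduces the loadings above (to ~.001)
round(1 - c(l1, l2, l3)^2, 3)       # uniquenesses, from psi_j = 1 - l_j^2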

Methods of estimation

One method uses principal components. Motivation: start from the spectral decomposition of $\Sigma$, $\Sigma = U \Lambda U'$. We seek an approximation with a small number $m < p$ of factors. A natural choice (the best rank-$m$ approximation, as we have seen) is to set
$$\Sigma_m = U_m \Lambda_m U_m' = U_m \Lambda_m^{1/2} \Lambda_m^{1/2} U_m' = (U_m \Lambda_m^{1/2})(U_m \Lambda_m^{1/2})' = LL' \tag{3}$$
where $U_m$ consists of the first $m$ columns of $U$ and $\Lambda_m = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$. Our factor model says that $\Sigma$ has the form $LL' + \Psi$, so if the $L$ in (3) were the correct one, then recalling that $\Psi$ is diagonal, we might set $\Psi = \mathrm{diag}(\Sigma - LL')$. With actual data, we mimic these steps, using either $S$ or $R$:

1. Perform a spectral decomposition of $S$ or $R$.
2. Choose $m < p$ and define $\hat{L} = \hat{U}_m \hat{\Lambda}_m^{1/2}$.
3. Set $\hat{\Psi} = \mathrm{diag}(S - \hat{L}\hat{L}')$.

How good is this fit? How do we choose $m$?

1. We can look at the residual matrix $\Delta_m = S - (\hat{L}\hat{L}' + \hat{\Psi})$, which we would like to be small. By definition, the diagonal elements of $\Delta_m$ are zero, and it can be shown that
$$\|\Delta_m\|_F^2 \le \hat{\lambda}_{m+1}^2 + \dots + \hat{\lambda}_p^2,$$
so one guide would be to choose $m$ such that the right side is small compared to $\sum_{j=1}^p \hat{\lambda}_j^2$.

2. We might alternatively evaluate the $k$th common factor by its contribution toward the total variance. Earlier, we saw that
$$\Sigma_{jj} = h_j^2 + \psi_j = \sum_{k=1}^m l_{jk}^2 + \psi_j \tag{4}$$
so the contribution to $\Sigma_{jj}$ from the $k$th factor is $l_{jk}^2$, and the contribution to the total variance $\mathrm{tr}(S) = \sum_{j=1}^p \Sigma_{jj}$ from the $k$th factor is
$$\sum_{j=1}^p l_{jk}^2. \tag{5}$$
N.B.: in the matrix of squared loadings $(l_{jk}^2)$, (4) takes a row sum, while (5) takes a column sum. With our principal component estimate, the sum of squares of the $k$th column is
$$\sum_{j=1}^p \hat{l}_{jk}^2 = \hat{\lambda}_k \|\hat{e}_k\|_2^2 = \hat{\lambda}_k,$$
so the proportional contribution to the total variance from the $k$th factor is
$$\frac{\hat{\lambda}_k}{\mathrm{tr}(S)} = \frac{\hat{\lambda}_k}{\sum_{j=1}^p \hat{\lambda}_j}.$$
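Here is a compact sketch of these three steps as an R function (my own, not from the notes), with a numerical check of the Frobenius bound on the residual matrix using a synthetic correlation matrix:

pc.factor <- function(R, m) {
  eig <- eigen(R)
  L <- eig$vectors[, 1:m, drop = FALSE] %*% diag(sqrt(eig$values[1:m]), m)
  Psi <- diag(diag(R - L %*% t(L)))  # put all remaining variance in Psi
  list(L = L, Psi = Psi, lambda = eig$values)
}
set.seed(3)
L0 <- matrix(runif(20, -1, 1), 10, 2) / 2     # synthetic true loadings
R0 <- cov2cor(L0 %*% t(L0) + 0.5 * diag(10))  # a valid correlation matrix
fit <- pc.factor(R0, 2)
Delta <- R0 - (fit$L %*% t(fit$L) + fit$Psi)  # residual matrix, zero diagonal
sum(Delta^2) <= sum(fit$lambda[3:10]^2)       # TRUE: bound on eigenvalues m+1..p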

Estimation for factor models

Let's estimate factor models for the DJIA data.

library(ggplot2)
library(reshape2)
load('../../datasets-other/djia/djia.rdata')
df <- djia.ldr[, 3:ncol(djia.ldr)]
n <- nrow(df); p <- ncol(df)
df <- data.frame(df)
names(df) <- colnames(df)
# transform each variable to approximate normal scores
for (j in 1:p) {
  Fj <- ecdf(df[, j])
  df[, j] <- qnorm(Fj(df[, j]) - 1/(2*n))
}
R <- cor(df); eig <- eigen(R)
lambda <- eig$values
cper <- data.frame(per.exp = cumsum(lambda)/sum(lambda))
cper$j <- seq(p)
ggplot(cper, aes(x = j, y = per.exp)) + geom_point() + ylim(c(0, 1))

[Figure 1: cumulative percent of variance explained, DJIA data]

Gamma <- eig$vectors
L <- Gamma %*% diag(sqrt(lambda))
m <- 4
L4 <- L[, 1:m]
df.L <- data.frame(L4)
names(df.L) <- paste('L', seq(m), sep = '')
df.L$stock <- names(df)
df.L <- melt(df.L, id = 'stock')
ggplot(df.L, aes(y = stock, x = variable, fill = value)) + geom_tile()

[Figure: heatmap of loadings on the first four factors, by stock]

psi <- diag(R) - diag(L4 %*% t(L4))
res3 <- R - L4 %*% t(L4) - diag(psi)
sum(c(res3^2))
## [1]
max(c(res3^2))
## [1]
sum(c(R^2))
## [1]
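Tying this back to the variance-contribution formula above: since we factor the correlation matrix, $\mathrm{tr}(R) = p$, so the proportional contribution of the $k$th factor is $\hat{\lambda}_k / p$. A short sketch (mine), continuing the session above:

lambda[1:m] / p       # proportional contribution of each retained factor
sum(lambda[1:m]) / p  # cumulative share, matching the scree plot at j = 4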

# res3 <- abs(res3)/abs(R)  # standardize to give a better notion of how much
#                           # error there is in each element
# res3 <- data.frame(res3)
# names(res3) <- names(df)
# res3$stock <- names(df)
# res3 <- melt(res3, id = 'stock')
# ggplot(res3, aes(x = variable, y = stock, fill = value)) + geom_tile()

Maximum likelihood method

load('../../datasets-other/djia/djia.rdata')
df <- djia.ldr[, 3:ncol(djia.ldr)]
n <- nrow(df); p <- ncol(df)
df <- data.frame(df)
names(df) <- colnames(df)
for (j in 1:p) {
  Fj <- ecdf(df[, j])
  df[, j] <- qnorm(Fj(df[, j]) - 1/(2*n))
}
fit4 <- factanal(df, 4, rotation = "none")
colSums(fit4$loadings^2)
## Factor1 Factor2 Factor3 Factor4
i.Psi <- diag(1/fit4$uniquenesses)
L <- fit4$loadings
LPL <- t(L) %*% i.Psi %*% L
# check that L' Psi^{-1} L is diagonal: the identifiability constraint (1)
max(LPL - diag(diag(LPL)))
## [1] e-16
Psi <- diag(fit4$uniquenesses)
res2 <- R - (L %*% t(L) + Psi)
sum(c(res2^2))
## [1]
max(c(res2^2))
## [1]
sum(R^2)
## [1]
Rhat <- L %*% t(L) + Psi
s <- ((p - m)^2 - (p + m))/2
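The degrees of freedom $s$ computed above feed the likelihood ratio test of $H_0: \Sigma = LL' + \Psi$ against a general $\Sigma$, mentioned earlier. A hedged sketch, continuing the session above, using the Bartlett-corrected form of the statistic (one standard version, as in Johnson and Wichern; factanal also reports its own version of this test):

# LR test of H0: Sigma = LL' + Psi, asymptotically chi-square on s df
lr.stat <- (n - 1 - (2*p + 4*m + 5)/6) * log(det(Rhat) / det(R))
pchisq(lr.stat, df = s, lower.tail = FALSE)  # approximate p-value
# factanal's built-in version of the test:
fit4$STATISTIC
fit4$PVAL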
