
1 Principal Components Analysis
Edps/Soc 584 and Psych 594, Applied Multivariate Statistics.
Carolyn J. Anderson, Department of Educational Psychology, University of Illinois at Urbana-Champaign.
(c) Board of Trustees, University of Illinois.

2 Outline
History and overview
Geometry
Algebra
Principal components obtained from standardized variables
Graphing principal components
Distinctions between PCA and factor analysis
Reading: Johnson & Wichern; good supplemental references: Jolliffe (1986), Krzanowski (1988), Flury (1988).

3 History
First introduced by Karl Pearson (1901) in Philosophical Magazine as a procedure for finding lines and planes which best fit a set of points in p-dimensional space. The focus was on geometric optimization.
Harold Hotelling (1933) published a paper on PCA in the Journal of Educational Psychology, which dealt with an algebraic optimization. He re-invented the method, but from a different perspective. His motivation was to find a smaller fundamental set of independent variables that determines the values of the original set of p variables. This is a factor-analytic type of idea, but PCA is not factor analysis (except in a very special and unrealistic case).
Hotelling chose components (linear combinations of the p variables) so as to maximize their successive contribution to the total variance.

4 History continued
Not much was done with respect to applications until the early 1960s, the advent of the computer age. There was then an explosion of applications and developments of the technique. Theory for sampling distributions (which leads to statistical inference) was developed.
There are many extensions of PCA (e.g., PCA for sets of matrices). SAS/IML macros (by me) and MATLAB code (by Mark de Rooij) are available; the algorithm is based on work by Kiers (1990).

5 Basic Idea
Reduce the dimensionality of a data set in which there is a large number of inter-related variables, while retaining as much as possible of the variation in the original set of variables. The reduction is achieved by transforming the original variables to a new set of variables, the principal components, which are uncorrelated and ordered so that the first few retain most of the variation present in the data.
Goals & Objectives:
Reduction and summary (data reduction).
Study the structure of Sigma (or S or R).
Interpretation.

6 Applications
Interpretation (study structure).
Create a new set of variables (a smaller number that are uncorrelated); these can be used in other procedures (e.g., multiple regression).
Select a subset of the original variables to be used in other multivariate procedures.
Detect outliers or clusters of observations.
Check the multivariate normality assumption (before analyzing the data with procedures that assume multivariate normality).

7 Population Principal Components
All your observations (measurements) are made on the members of the population. For example, the European countries in one study could be considered the population, and you have data for each of them (the variables are percentages of people employed in different industries). The psychological test data consist of measurements on 64 subjects; these subjects are a sample from some population, and if we repeated the study we would most likely have different individuals.
In population principal components, we can compute Sigma, and the principal components (PCs) are derived from Sigma.

8 Two Approaches
Algebraically: PCs are linear combinations of the p original variables X_1, X_2, ..., X_p such that
the first PC has as large a variance as possible,
the second PC has as large a variance as possible and is orthogonal to the first, etc.
Geometrically (at least three views):
Rotation to a new coordinate system.
Best-fit hyper-plane.
See the appendix of the text for the n-space interpretation.

9 Geometry of PCA: p-space
PCs represent a selection of a new coordinate system obtained by rotating the original axes to a set of new axes (to provide a simpler structure).
The first principal component represents the direction of maximum variability. The second principal component represents the direction of maximum variability that is orthogonal to the first. And so on, until the last PC, which represents the direction of minimum variability and is orthogonal to all of the others.
Best fit is defined as minimizing the sum of squared distances between the points that represent cases and the space defined by the principal components:
The first principal component defines a line; the sum of squared distances (i.e., sum_j d_j^2) between the points and this line is minimized.
The first two principal components define a plane; the sum of squared distances between the points and this plane is minimized. Etc.

10 Axis Rotation & Best-Fit Line
[Figure: the original axes x_1 (height) and x_2 (weight) are rotated through an angle theta to the principal axes y_1 (size) and y_2 (shape); the first axis is the line for which the sum of squared distances from the points is minimized, and var(y_2) = 0.236.]

11 Further Notes regarding PCs
They are variance preserving. For example, var(x_1) + var(x_2) = 2.015 = var(y_1) + var(y_2).
If you rotate PCs, you no longer have PCs.
PCs depend only on Sigma (or R if you are using standardized variables).
PCs do not require any assumptions about the distribution of the variables (e.g., multivariate normality). If the variables do come from a multivariate normal population, then PCs can be interpreted in terms of constant density ellipsoids, and you can make inferences about the population from a sample. However, right now we are considering population PCs, so we don't have a sample and hence no inference is required.

12 The Algebra of Population PCA
We want to transform the p variables X' (1 x p) = (X_1, X_2, ..., X_p) to q orthogonal linear combinations Y' (1 x q) = (Y_1, Y_2, ..., Y_q), where generally q << p.
There are p possible components:
Y_1 = a_1'X = a_11 X_1 + a_12 X_2 + ... + a_1p X_p
Y_2 = a_2'X = a_21 X_1 + a_22 X_2 + ... + a_2p X_p
...
Y_p = a_p'X = a_p1 X_1 + a_p2 X_2 + ... + a_pp X_p,
or, in matrix form, Y = A X.
Given the covariance matrix Sigma_X of the X's, we know
var(Y_i) = a_i' Sigma_X a_i and cov(Y_i, Y_k) = a_i' Sigma_X a_k.
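To make the algebra concrete, here is a minimal numpy sketch (the covariance matrix and weight vectors below are hypothetical, chosen only for illustration) showing that var(a'X) = a' Sigma a and cov(a'X, b'X) = a' Sigma b.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix (not from the slides)
Sigma = np.array([[4.0, 1.5, 0.5],
                  [1.5, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

a = np.array([0.6, 0.5, 0.6])   # weights for one linear combination Y1 = a'X
b = np.array([0.7, -0.7, 0.1])  # weights for a second combination Y2 = b'X

var_Y1 = a @ Sigma @ a          # var(a'X) = a' Sigma a
cov_Y1_Y2 = a @ Sigma @ b       # cov(a'X, b'X) = a' Sigma b
print(var_Y1, cov_Y1_Y2)
```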

13 More Formal Definition of PCs
PCs are the uncorrelated linear combinations, cov(Y_i, Y_k) = 0 for all i != k, with variances as large as possible. In particular,
var(Y_1) is the maximum: find a_1 such that a_1' Sigma_X a_1 = max over a of a' Sigma_X a;
var(Y_2) is the maximum subject to being uncorrelated with Y_1: find a_2 such that a_2' Sigma_X a_2 = max over a of a' Sigma_X a with a_1' Sigma_X a_2 = 0.
At each step, select a_i such that a_i'X has maximum variance subject to being uncorrelated with all of the previous linear combinations.
Usually (but not always), we only use Y_1, Y_2, ..., Y_q where q is much less than p (the primary goal is data reduction). There are p possible components, and all of Y_1, Y_2, ..., Y_p are needed to completely reproduce (represent) Sigma_X. So if q < p, we don't reproduce Sigma_X exactly (unless the rank of Sigma_X equals q).

14 Maximizing the Criterion
The criterion to be maximized is max over a of a' Sigma_X a.
We can always multiply Y_1 = a'X by a constant c > 1, which will increase the variance: var(cY_1) = var(c a'X) = c^2 var(a'X). Therefore we normalize the coefficient vector so that a'a = 1.
Our problem is to find the a_1 that maximizes the variance subject to this constraint:
max over a of (a' Sigma_X a / a'a) = var(Y_1).
Using the results on maximization from the More Linear Algebra notes,
max over a of (a' Sigma_X a / a'a) = lambda_1,
which is attained when a = e_1, where lambda_1 and e_1 are the first eigenvalue and eigenvector of Sigma_X.
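A quick numerical check of this result (using the same hypothetical Sigma_X as above): among unit-length coefficient vectors, no a' Sigma_X a value exceeds lambda_1.

```python
import numpy as np

rng = np.random.default_rng(3)
Sigma_X = np.array([[4.0, 1.5, 0.5],
                    [1.5, 3.0, 1.0],
                    [0.5, 1.0, 2.0]])

lam1 = np.linalg.eigvalsh(Sigma_X)[-1]          # largest eigenvalue of Sigma_X
A = rng.normal(size=(100000, 3))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # random unit vectors a with a'a = 1
quad = np.einsum('ij,jk,ik->i', A, Sigma_X, A)  # a' Sigma_X a for each random a
print(quad.max() <= lam1 + 1e-12, lam1)         # the maximum over unit vectors is lambda_1
```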

15 Proof that this is the Maximum
Showing is better than just believing...
a' Sigma_X a / a'a = var(Y_1)
a' Sigma_X a = var(Y_1) a'a
a' Sigma_X a - var(Y_1) a'a = 0
a' (Sigma_X a - var(Y_1) a) = 0
Sigma_X a - var(Y_1) a = 0   (since a != 0)
Sigma_X a = var(Y_1) a,
with Sigma_X (p x p), a (p x 1), and var(Y_1) a scalar. This is just the equation that eigenvalues and eigenvectors solve. So Y_1 = e_1'X, where e_1 is the 1st eigenvector of Sigma_X, and var(Y_1) = lambda_1.

16 Population PC: Result 1
Let Sigma be the covariance matrix associated with the vector X' = (X_1, X_2, ..., X_p), and let Sigma have the eigenvalue-eigenvector pairs (lambda_1, e_1), (lambda_2, e_2), ..., (lambda_p, e_p) where lambda_1 >= lambda_2 >= ... >= lambda_p >= 0. Then the i-th PC is given by
Y_i = e_i'X = e_i1 X_1 + e_i2 X_2 + ... + e_ip X_p, for i = 1, 2, ..., p.
Given this,
var(Y_i) = e_i' Sigma e_i = e_i'(lambda_i e_i) = lambda_i e_i'e_i = lambda_i,
and for i != k,
cov(Y_i, Y_k) = e_i' Sigma e_k = e_i'(lambda_k e_k) = lambda_k e_i'e_k = 0.
If some of the lambda_i are equal, then the choice of the corresponding coefficient vectors e_i (and thus the Y_i) is not unique.

17 Population PC continued
We can write all of this in terms of matrices. Let P = (e_1, e_2, ..., e_p). Then
Y = P'X and cov(Y) = Sigma_Y = P' Sigma_X P.
So
Sigma_X = cov(X) = P Lambda P' and Sigma_Y = cov(Y) = Lambda = P' Sigma_X P = diag(lambda_i).
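A minimal sketch (hypothetical covariance matrix again, not from the slides) verifying these matrix identities with numpy: the eigenvector matrix P diagonalizes Sigma_X, and the total variance is preserved.

```python
import numpy as np

# Hypothetical covariance matrix (for illustration only)
Sigma_X = np.array([[4.0, 1.5, 0.5],
                    [1.5, 3.0, 1.0],
                    [0.5, 1.0, 2.0]])

# np.linalg.eigh returns eigenvalues in ascending order; reverse so lambda_1 >= ... >= lambda_p
eigvals, eigvecs = np.linalg.eigh(Sigma_X)
Lam = np.diag(eigvals[::-1])          # Lambda = diag(lambda_i)
P = eigvecs[:, ::-1]                  # columns are e_1, ..., e_p

# Check Sigma_Y = P' Sigma_X P = Lambda  and  Sigma_X = P Lambda P'
print(np.allclose(P.T @ Sigma_X @ P, Lam))   # True
print(np.allclose(P @ Lam @ P.T, Sigma_X))   # True

# The trace is preserved: total variance = sum of the eigenvalues
print(np.isclose(np.trace(Sigma_X), eigvals.sum()))  # True
```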

18 More Population PC Results
Let X' = (X_1, X_2, ..., X_p) have covariance matrix Sigma_X with eigenvalue-eigenvector pairs (lambda_1, e_1), (lambda_2, e_2), ..., (lambda_p, e_p) where lambda_1 >= lambda_2 >= ... >= lambda_p >= 0, and let Y_1 = e_1'X, Y_2 = e_2'X, ..., Y_p = e_p'X be the PCs. Then
sigma_11 + sigma_22 + ... + sigma_pp = sum_{i=1}^p sigma_ii = lambda_1 + lambda_2 + ... + lambda_p = sum_{i=1}^p lambda_i.
The total population variance is preserved by the transformation.
The proportion of total variance due to the k-th PC is
lambda_k / (lambda_1 + lambda_2 + ... + lambda_p) = lambda_k / sum_{i=1}^p lambda_i, for k = 1, ..., p.

19 Proportion of Variance Accounted For
We often select q PCs such that the proportions for k = 1, ..., q sum to something close to 1 (yet without too large a value of q). The proportion of variance accounted for by the first q PCs equals
sum_{k=1}^q lambda_k / trace(Sigma_X).
We try to balance the percent of variance (information) retained against the number of PCs (simplicity). We may want to replace X by Y.
Often we are interested in interpreting the new variables (i.e., the PCs), so we examine the elements of the e_i's. The size (magnitude) of the elements of e_i is an indicator of a variable's importance to the i-th PC.
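A short sketch (hypothetical eigenvalues, for illustration) of choosing q from the cumulative proportion of variance:

```python
import numpy as np

eigvals = np.array([5.1, 2.6, 1.3])           # hypothetical eigenvalues, sorted descending
prop = eigvals / eigvals.sum()                # proportion of total variance per PC
cum_prop = np.cumsum(prop)                    # cumulative proportion
q = int(np.searchsorted(cum_prop, 0.90) + 1)  # smallest q with cumulative proportion >= 0.90
print(prop, cum_prop, q)
```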

20 Correlation between Y_i and X_k
If Y_1 = e_1'X, Y_2 = e_2'X, ..., Y_p = e_p'X are the PCs obtained from Sigma_X, we can use rho_{Y_i, X_k} to help interpret the contribution of an X_k to Y_i:
rho_{Y_i, X_k} = cov(Y_i, X_k) / (sqrt(lambda_i) sqrt(sigma_kk))
= cov(e_i'X, l_k'X) / (sqrt(lambda_i) sqrt(sigma_kk)), where l_k' (1 x p) = (0, ..., 0, 1, 0, ..., 0) with the 1 in the k-th position,
= l_k' Sigma e_i / (sqrt(lambda_i) sqrt(sigma_kk))
= l_k' (lambda_i e_i) / (sqrt(lambda_i) sqrt(sigma_kk))
= e_ik sqrt(lambda_i) / sqrt(sigma_kk).
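A sketch (same hypothetical Sigma_X as in the earlier blocks) that computes the correlations e_ik * sqrt(lambda_i) / sqrt(sigma_kk) between each PC and each original variable:

```python
import numpy as np

Sigma_X = np.array([[4.0, 1.5, 0.5],
                    [1.5, 3.0, 1.0],
                    [0.5, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma_X)
lam = eigvals[::-1]                 # lambda_1 >= ... >= lambda_p
E = eigvecs[:, ::-1]                # column i is e_i

sd_x = np.sqrt(np.diag(Sigma_X))    # sqrt(sigma_kk)
# rho[k, i] = e_ik * sqrt(lambda_i) / sqrt(sigma_kk)
rho = E * np.sqrt(lam) / sd_x[:, None]
print(np.round(rho, 3))
```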

21 Example: European Employment Data
The data are percentages of people employed in different industries in European countries during 1979 (the cold war era). Data are from Euromonitor (1979), European Marketing Data and Statistics, London: Euromonitor Publications; I got them off of the web.
N = 26 countries. There are 9 industries, but we'll start with just p = 3:
X_1 = percent in manufacturing,
X_2 = percent in the services industry,
X_3 = percent in social and personal services.
mu = ..., Sigma = ..., total variance = trace(Sigma) = ...

22 Example: Eigenvalues and Eigenvectors
[Table: for i = 1, 2, 3, the eigenvalue lambda_i = var(Y_i), the cumulative variance, and the cumulative percent.]
The eigenvectors, which give the weights for the principal components, are
e_1' = (0.580, 0.396, 0.712)
e_2' = (0.811, -0.207, -0.546)
e_3' = (-0.069, 0.894, -0.442)
So the principal components are
Y_1 = 0.580 X_1 + 0.396 X_2 + 0.712 X_3
Y_2 = 0.811 X_1 - 0.207 X_2 - 0.546 X_3
Y_3 = -0.069 X_1 + 0.894 X_2 - 0.442 X_3

23 Example: Interpretation of the Principal Components
We'll look at the correlations between Y_1 and Y_2 and each of the X_k's (rho_{Y_i, X_k} = e_ik sqrt(lambda_i) / sqrt(sigma_kk)):
Original variable            Y_1     Y_2
Manufacturing (X_1)          ...     .75
Service (X_2)                .69     ...
Social & Personal (X_3)      ...     -.52
Y_1: All variables contribute to the first component; it is an overall percent employment across these industries.
Y_2: This contrasts Manufacturing with Service and Social & Personal.

24 Plot of Component Scores
[Figure: plot of the countries' scores on the principal components.]

25 If the Population is Multivariate Normal
We have an additional interpretation if X ~ N_p(mu, Sigma). Recall that the probability density contours (ellipsoids) are
(x - mu)' Sigma^{-1} (x - mu) = c^2.
The center is at mu, and the axes are at mu +/- c sqrt(lambda_i) e_i, where lambda_i and e_i are the i-th eigenvalue and eigenvector of Sigma.
The principal components are Y_1 = e_1'X, Y_2 = e_2'X, ..., Y_p = e_p'X.
The principal components lie in the same directions as the axes of the probability contours (ellipsoids).

26 Probability Contour
[Figure: a probability contour (ellipse) in the (X_1, X_2) plane centered at mu; the marked point (x_j1, x_j2) = mu + c e_1 lies on the first principal axis Y_1 = e_1'X, and Y_2 = e_2'X is the second principal axis.]

27 Center at (0, 0)
[Figure: the same ellipse after centering, with axes X_1 - mu_1 and X_2 - mu_2. In the centered X coordinates the marked point is (x_j1, x_j2) = c e_1; in PC coordinates it is (y_1, 0) = (e_11 x_1 + e_12 x_2, 0), i.e., it lies on the first principal axis Y_1 = e_1'X, with Y_2 = e_2'X the second axis.]

28 Summary example when X ~ N_p(mu, Sigma)
Any point on the i-th axis of the ellipsoid has X coordinates mu + c e_i, i.e., coordinates proportional to e_i' = (e_i1, e_i2, ..., e_ip) in the coordinate system that has its origin at mu and axes parallel to the original X axes.
Subtracting the mean doesn't change anything except moving the origin to (0, 0).
In the coordinate system of the PCs, such a point has principal component coordinates (y_i, 0), because the PCs are obtained by a rigid rotation of the original coordinate axes through an angle theta until they coincide with the axes of the ellipsoid.
All of these results generalize to p > 2.

29 When Variances are Very Different: Principal Components from Standardized Variables
If we use standardized variables (z-scores),
Z_1 = (X_1 - mu_1) / sqrt(sigma_11), Z_2 = (X_2 - mu_2) / sqrt(sigma_22), ..., Z_p = (X_p - mu_p) / sqrt(sigma_pp),
or, in matrix notation,
Z = V^{-1/2} (X - mu) = V^{-1/2} X - V^{-1/2} mu, where V^{-1/2} = diag(1/sqrt(sigma_ii)).
So Z is a linear combination of X, which means...

30 PCs of Standardized Variables
We know that
E(Z) = E(V^{-1/2}(X - mu)) = V^{-1/2} E(X) - V^{-1/2} mu = 0
and
Sigma_Z = V^{-1/2} Sigma_X V^{-1/2} = R,
which is the (population) correlation matrix of the X's.
The i-th PC of the standardized variables Z' = (Z_1, Z_2, ..., Z_p) with Sigma_Z = R is given by
Y~_i = e~_i'Z = e~_i' V^{-1/2} (X - mu), for i = 1, 2, ..., p,
where e~_i is the i-th eigenvector and lambda~_i is the i-th eigenvalue of R. Note that
sum_{i=1}^p var(Z_i) = sum_{i=1}^p lambda~_i = sum_{i=1}^p var(Y~_i) = trace(R) = p.

31 PCs of Standardized versus Non-standardized Variables
Almost always lambda_i != lambda~_i and e_i != e~_i; that is, the PCs from Sigma_X are not the same as the PCs from R.
We'll look at a situation where standardization makes a difference. This will be the case when the scales of the X variables are (substantially or vastly) different and they are not comparable.
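A minimal sketch (simulated data, not the track data) illustrating that the eigenvalues of the covariance matrix S and of the correlation matrix R generally differ, especially when the variables' scales differ greatly:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: two correlated variables on very different scales
x1 = rng.normal(0, 1, size=200)                # sd about 1
x2 = 100 * (0.5 * x1 + rng.normal(0, 1, size=200))   # sd about 100: a wildly different scale
X = np.column_stack([x1, x2])

S = np.cov(X, rowvar=False)        # sample covariance matrix
R = np.corrcoef(X, rowvar=False)   # sample correlation matrix

vals_S = np.linalg.eigvalsh(S)
vals_R = np.linalg.eigvalsh(R)

# Proportion of total variance for the largest eigenvalue under each analysis
print(vals_S[-1] / vals_S.sum())   # close to 1: x2 dominates the covariance-based PCA
print(vals_R[-1] / vals_R.sum())   # much smaller: both variables contribute after standardizing
```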

32 Men's Track Data
From Johnson & Wichern: the data are from the Track and Field Statistics Handbook for the 1984 Los Angeles Olympics. These are the national record times for men before the 1984 Olympics. The record times for eight races (i.e., p = 8) are listed for 55 countries (i.e., n = 55). The times are recorded for the following races:
1. 100m: record time for the 100m race, in seconds
2. 200m: record time for the 200m race, in seconds
3. 400m: record time for the 400m race, in seconds
4. 800m: record time for the 800m race, in minutes
5. 1500m: record time for the 1500m race, in minutes
6. 5K: record time for the 5000m race, in minutes
7. 10K: record time for the 10000m race, in minutes
8. Marathon: record time for the Marathon (approx. 26 miles), in minutes

33 Summary Statistics
[Table: the mean and standard deviation of each variable (100m, 200m, 400m, 800m, 1500m, 5K, 10K, Marathon), followed by the 8 x 8 sample covariance matrix (truncated values).]

34 Eigenvalues of Sigma
From the SAS PRINCOMP procedure:
[Table: total variance and, for each component, the eigenvalue of the covariance matrix, the difference, the proportion, and the cumulative proportion.]

35 Eigenvectors of Sigma
[Table: the first three eigenvectors (Prin1, Prin2, Prin3) of the covariance matrix, with one row per race.]
The 1st principal component is essentially the marathon, because the marathon has by far the largest variance; the next largest is 3.26 (the 10K). The variance of the 1st component is ...

36 The Correlation Matrix
[Table: the 8 x 8 correlation matrix of the race times (values truncated).]

37 The Eigenvalues of the Correlation Matrix
[Table: for each component, the eigenvalue of the correlation matrix, the difference, the proportion, and the cumulative proportion.]
Total variance = 8. The first 2 principal components account for 93.8% of the total variance.

38 The Eigenvectors of the Correlation Matrix
[Table: the first two eigenvectors (component loadings), with one row per race.]
First component: an overall measure; high values on this component indicate slower runners.
Second component: a contrast of long and short races. Small values indicate a country is faster on short races than on long ones; large values indicate it is slower on short races than on long ones; a value near zero means the country tends to be similar on short and long races (it could be slow, fast, or somewhere in between on all races).

39 Correlations (Variables, Principal Components)
The correlations between the standardized variables and the values on the principal components equal
r_{Z_k, Y~_i} = sqrt(lambda~_i) e~_ki (e.g., sqrt(6.622)(.318) = .82).
[Table: these correlations, with one row per race.]
The interpretation is pretty much the same as for the loadings.

40 Graph of Countries' Component Scores
[Figure: plot of the countries' scores on the first two principal components.]

41 Graph of Countries' Component Scores (continued)
[Figure: another plot of the countries' component scores.]

42 Sample Principal Components
Used to summarize the sample variation by PCs. The algebra is the same as in population principal components.
x_1, x_2, ..., x_n are n independent observations from a population with mean mu and covariance Sigma.
xbar (p x 1) = sample mean vector.
S (p x p) = {s_ik} = sample covariance matrix.
S has eigenvalue-eigenvector pairs (lambda^_1, e^_1), ..., (lambda^_p, e^_p) where lambda^_1 >= lambda^_2 >= ... >= lambda^_p. The hat indicates that these are estimates of population values.
The i-th sample principal component is given by
y^_i = e^_i'x = e^_i1 x_1 + e^_i2 x_2 + ... + e^_ip x_p.
The i-th PC sample variance is var(y^_i) = lambda^_i for i = 1, ..., p. The PC sample covariances are cov(y^_i, y^_k) = 0 for all i != k.

43 Algebra of Sample PCs continued
Total sample variance: trace(S) = tr(S) = sum_{i=1}^p s_ii = sum_{i=1}^p lambda^_i.
Proportion of total sample variance accounted for by the i-th PC: lambda^_i / sum_{k=1}^p lambda^_k.
Correlations between y^_i and x_k: r_{y^_i, x_k} = e^_ik sqrt(lambda^_i) / sqrt(s_kk).
Note: if you use standardized x's, then r_{y^_i, z_k} = e~^_ik sqrt(lambda~^_i).
The sample PCs based on S are not the same as those based on R (I'll use a tilde to denote those based on R). Use S when the observations are in the same units and the variances s_ii are not vastly different.
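A compact sketch (simulated data, not the course data sets) of computing sample principal components directly from a data matrix: center the data, form S, take its eigen decomposition, and compute component scores.

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated n x p data matrix (n = 50 observations, p = 4 variables)
X = rng.multivariate_normal(mean=np.zeros(4),
                            cov=np.array([[3.0, 1.0, 0.5, 0.2],
                                          [1.0, 2.0, 0.4, 0.1],
                                          [0.5, 0.4, 1.5, 0.3],
                                          [0.2, 0.1, 0.3, 1.0]]),
                            size=50)

xbar = X.mean(axis=0)                 # sample mean vector
Xc = X - xbar                         # centered data
S = np.cov(X, rowvar=False)           # sample covariance matrix (divides by n - 1)

lam_hat, E_hat = np.linalg.eigh(S)                 # ascending order
lam_hat, E_hat = lam_hat[::-1], E_hat[:, ::-1]     # sort so lambda^_1 >= ... >= lambda^_p

scores = Xc @ E_hat                   # y^_ij = e^_j'(x_i - xbar): component scores

# Sample variances of the scores equal the eigenvalues, and the scores are uncorrelated
print(np.allclose(np.var(scores, axis=0, ddof=1), lam_hat))
print(np.round(np.corrcoef(scores, rowvar=False), 3))
```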

44 Geometry of Sample PCs
PCs based on a sample of n p-dimensional observations are new variables specified by a rigid rotation of the original axes to a new orientation, such that the directions of the axes in the new orientation have maximum variances in the sample. The rotation must be rigid since the new variables must be uncorrelated. The directions of the new axes are based on S (or R).
[Figure: the original axes x_1 and x_2 rotated through an angle theta to the sample principal axes y^_1 = e^_1'x and y^_2 = e^_2'x.]

45 Geometry of Sample PCs continued
The PCs are projections of the observations onto the principal axes of the ellipsoids.
We can re-center the x's, which also centers the y^'s; that is, the mean of (x_i - xbar) is 0, so each y^_i has mean 0. Subtraction of xbar only affects the mean; it does not affect the variances and covariances.
(x_1, x_2) --shift location--> (x_1 - xbar_1, x_2 - xbar_2) --rigid rotation--> (y^_1, y^_2)
[Figure: scatter plot showing the centered axes and the rotated principal axes y^_1 and y^_2.]

46 2nd Geometric Interpretation
The 1st PC y^_1 minimizes the sum of squared deviations (distances) of the points from a line (a least-squares best fit). When you approximate p-dimensional data by r << p PCs, the r PCs minimize the sum of squared distances of the points in p-space from the r-dimensional sub-space.
[Figure: the best-fit line y^_1 and the second axis y^_2 in the (x_1, x_2) plane.]

47 Swiss Bank Notes
These data are from Flury & Riedwyl (1988), Multivariate Statistics: A Practical Approach. The data consist of p = 6 measurements, in millimeters, on n = 100 genuine Swiss bank notes (old ones); a picture is shown on the next slide.
x_1: length of the bank note,
x_2: height of the bank note, measured on the left,
x_3: height of the bank note, measured on the right,
x_4: distance of the inner frame to the lower border,
x_5: distance of the inner frame to the upper border,
x_6: length of the diagonal.

48 Picture of Bank Note
[Figure: a bank note with the six measurements X_1 through X_6 marked on it.]

49 Swiss Bank Notes: Sample Statistics
Sample means: xbar' = ( ..., ..., ..., 8.305, ..., ... )
[Table: the 6 x 6 sample covariance matrix S with rows and columns Length (X_1), Left (X_2), Right (X_3), Bottom (X_4), Top (X_5), Diagonal (X_6).]

50 Eigenvalues of S
The variances of the principal components (i.e., the eigenvalues of S):
[Table: for each PC, the eigenvalue lambda^_i, the proportion of variance, and the cumulative proportion.]

51 Eigenvectors for the Genuine Bank Notes
The principal components (eigenvectors of S):
[Table: the eigenvectors defining Y_1 through Y_6, with one row per variable: Length X_1, Left X_2, Right X_3, Bottom X_4, Top X_5, Diagonal X_6.]

52 Correlations between the Measures and the PCs
The correlations between the original variables and the principal components (i.e., r_{x_k, y_i} = e^_ki sqrt(lambda^_i) / sqrt(s_kk)):
[Table: these correlations for Y_1 through Y_6, with one row per variable.]
Y_1 is a contrast between Bottom & Top. Y_2 is overall size, except for the Diagonal. Y_3 & Y_4: nothing obvious. Y_5 is something like the image. Y_6: measurement error or the slant of the cut.

53 The Latter PCs
We've focused on the first PCs, but the last ones can also be informative. Small values for the smallest eigenvalues from either S or R indicate:
Undetected linear dependencies in the data.
One (or more) of the variables is redundant with the others and could be deleted.
Such PCs can be substantively just as important as the PCs associated with the largest eigenvalues. Alternatively, they could be due to pure error variability (measurement error).
Swiss bank note example: the last PC is basically the contrast X_2 - X_3, i.e., (Left height) - (Right height); typically this difference is > 0. So this last PC could reflect the slant of the cut. If X_2 and X_3 are measuring the same quantity, the only reason that lambda^_6 > 0 is measurement error (error variability).

54 Asymptotic Distributions
If X_1, X_2, ..., X_n is a sample from N_p(mu, Sigma), then the sample principal components y^_i = e^_i'(x - xbar) are observations ("realizations") of the population principal components Y_i = e_i'(X - mu), and since y^_i is a linear combination of x, which comes from N_p(mu, Sigma),
Y^' = (Y^_1, Y^_2, ..., Y^_p) ~ N_p(0, Lambda), where Lambda = diag(lambda_i).

55 Asymptotic Distributions continued
Assume X_j ~ N_p(mu, Sigma) i.i.d. for j = 1, 2, ..., n. Sigma, which is unknown, has eigenvalues lambda_1 > lambda_2 > ... > lambda_p (an assumption) with associated eigenvectors e_1, e_2, ..., e_p. For very large n:
1. lambda^_i is independent of its corresponding e^_i.
2. sqrt(n)(lambda^ - lambda) ~ N_p(0, 2 Lambda^2), or equivalently lambda^ ~ N_p(lambda, (2/n) Lambda^2), where lambda^ are the eigenvalues of S and lambda are the eigenvalues of Sigma. So lambda^_i ~ N_1(lambda_i, (2/n) lambda_i^2) for i = 1, ..., p.
3. sqrt(n)(e^_i - e_i) ~ N_p(0, E_i), where E_i = lambda_i sum_{k != i} [lambda_k / (lambda_k - lambda_i)^2] e_k e_k'.
Note: E_i is not diagonal, and the eigenvectors are not independent.

56 Using the Distribution of the lambda^'s
Since the lambda^_i's are asymptotically (very large n) independent and normal with mean lambda_i and variance (2/n) lambda_i^2, a (1 - alpha)100% confidence interval for lambda_i is
lambda^_i / (1 + z_{alpha/2} sqrt(2/n)) <= lambda_i <= lambda^_i / (1 - z_{alpha/2} sqrt(2/n)),
where z_{alpha/2} is the upper (alpha/2)-th percentile of N(0, 1).
We can also do a Bonferroni-type procedure and use z_{alpha/(2m)}, where m = the number of intervals you plan to construct.
Swiss bank note example: the 95% confidence interval for lambda_1 is (.5395, .9534); the rest are on the next slide.
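A small sketch of this large-sample interval (the function name is mine; the eigenvalue 0.689 below is a hypothetical value chosen only so the output roughly matches the interval quoted above for n = 100):

```python
import numpy as np
from scipy.stats import norm

def eigenvalue_ci(lam_hat, n, alpha=0.05, m=1):
    """Large-sample (1 - alpha)100% CI for a population eigenvalue.

    lam_hat : sample eigenvalue of S
    n       : sample size
    m       : number of intervals (Bonferroni correction uses alpha/m)
    """
    z = norm.ppf(1 - alpha / (2 * m))   # upper alpha/(2m) quantile of N(0, 1)
    half = z * np.sqrt(2.0 / n)
    return lam_hat / (1 + half), lam_hat / (1 - half)

print(eigenvalue_ci(0.689, n=100))
```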

57 Swiss Bank Notes: CIs for the lambda's
[Table: for each PC, the eigenvalue lambda^_i, the proportion of variance, the cumulative proportion, and the lower and upper limits of the 95% confidence interval.]

58 Using the Distribution of the e^_i's
The e^_i's are approximately normal with mean e_i. The elements of each e^_i are correlated, and these correlations depend on the ratios lambda_k / (lambda_k - lambda_i)^2, that is, on how far lambda_k is from lambda_i.
It can be useful to look at the diagonal elements of (1/n) E^_i; their square roots are the standard errors of the e^_ki's. Recall
E^_i = lambda^_i sum_{k != i} [lambda^_k / (lambda^_k - lambda^_i)^2] e^_k e^_k'.
Notes:
1. The variance of lambda^ increases as lambda increases, so large lambda's can have very wide confidence intervals.
2. These sampling results apply only to S; they do not apply to R.

59 Testing H_0: lambda_i = lambda for i = (r + 1), ..., p
Bartlett (1947) developed a test of the hypothesis that the (p - r) smallest eigenvalues of Sigma are equal, for 0 < r < p - 1. If the data support this hypothesis, then there will probably be little interest in using more than r components.
Bartlett's approximate chi-square statistic has the form
M [ -ln(det(S)) + sum_{i=1}^r ln(lambda^_i) + (p - r) ln(lambda_bar) ],
where
M = n - r - (1/6)(2(p - r) + 1 + 2/(p - r)),
lambda_bar = (1/(p - r)) ( tr(S) - sum_{i=1}^r lambda^_i ),
df = (1/2)(p - r - 1)(p - r + 2).

60 Bartlett's Test continued
Lawley (1956) gave a modification of Bartlett's test. Anderson (1963) discusses a related test of the hypothesis that some k intermediate eigenvalues are equal, i.e., H_0: lambda_{q+1} = ... = lambda_{q+k} (with lambda_1, ..., lambda_q and lambda_{q+k+1}, ..., lambda_p unrestricted).
Swiss bank note example: p = 6, n = 100, r = 3, and H_0: lambda_4 = lambda_5 = lambda_6.
M = n - r - (1/6)(2(p - r) + 1 + 2/(p - r)) = 100 - 3 - (1/6)(2(6 - 3) + 1 + 2/(6 - 3)) = 95.72
lambda_bar = (1/(p - r)) ( tr(S) - sum_{i=1}^r lambda^_i ) = ...

61 Swiss Bank Note Example continued
det(S) = ..., sum_{i=1}^r ln(lambda^_i) = ...
Test statistic = M ( -ln(det(S)) + sum_{i=1}^r ln(lambda^_i) + (p - r) ln(lambda_bar) ) = ...
df = (1/2)(p - r - 1)(p - r + 2) = (1/2)(2)(5) = 5.
Comparing to a chi-square distribution with df = 5, we find p-value = .02.
How about another value for r? (See the SAS module.)
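A sketch of Bartlett's statistic as defined on the previous slides (the eigenvalues below are hypothetical, since the bank note values were not preserved here; the function name is mine):

```python
import numpy as np
from scipy.stats import chi2

def bartlett_last_eigenvalues(eigvals, n, r):
    """Test H0: the last (p - r) eigenvalues are equal.

    eigvals : sample eigenvalues of S, sorted in descending order
    n       : sample size
    r       : number of leading eigenvalues left unrestricted
    """
    eigvals = np.asarray(eigvals, dtype=float)
    p = eigvals.size
    lam_bar = eigvals[r:].mean()                       # average of the last p - r eigenvalues
    M = n - r - (2 * (p - r) + 1 + 2.0 / (p - r)) / 6.0
    # -ln det(S) + sum_{i<=r} ln(lambda_i) = -sum_{i>r} ln(lambda_i)
    stat = M * ((p - r) * np.log(lam_bar) - np.log(eigvals[r:]).sum())
    df = 0.5 * (p - r - 1) * (p - r + 2)
    return stat, df, chi2.sf(stat, df)

# Hypothetical eigenvalues for p = 6 variables, n = 100 observations, r = 3
lam_hat = [3.0, 0.9, 0.4, 0.12, 0.08, 0.05]
print(bartlett_last_eigenvalues(lam_hat, n=100, r=3))
```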

62 Graphing Principal Components
Compute y^_i = e^_i'x and plot these. Such plots can:
Reveal suspect observations (outliers, influential observations).
Check the multivariate normality assumption.
Look for clusters.
Provide insight into the structure of the data.
Suspect Observations
The first PCs can help reveal influential observations: those that contribute more to the variances than other observations, such that if we removed them the results would change quite a bit.
The last PCs can help reveal outliers: observations that are atypical of the data set; they are inconsistent with the rest of the data (and could be mis-coded).

63 Swiss Bank Notes: Outliers?
[Figure: plots of the bank notes' component scores, used to look for outliers.]

64 Why Look at the Last PCs to Find Outliers?
Multivariate outliers may not be extreme on any of the original variables. They can still be outliers in multivariate space because they do not conform to the correlational structure of the rest of the data.
Mathematical explanation: recall that Y^ (p x 1) = P^' X (p x 1), where P^ = (e^_1, e^_2, ..., e^_p). Since P^ P^' = P^' P^ = I,
X = P^ Y^,
i.e., the X's are a linear combination of the principal components (the Y^'s). Consider an observation x_j:
x_j = P^ y^_j = y^_1j e^_1 + y^_2j e^_2 + ... + y^_pj e^_p = (y^_1j e^_1 + ... + y^_{q-1,j} e^_{q-1}) + (y^_qj e^_q + ... + y^_pj e^_p).

65 Outliers & Influential Observations
The size (magnitude) of the last PCs determines how well the first few PCs fit an observation; that is, (y^_1j e^_1 + ... + y^_{q-1,j} e^_{q-1}) differs from x_j by (y^_qj e^_q + ... + y^_pj e^_p). The suspect observations (outliers) are the ones for which at least one of the coordinates y^_qj, ..., y^_pj is large.
Spotting influential observations is also based on the fact that x_j = P^ y^_j. Again consider
x_j = (y^_1j e^_1 + ... + y^_{q-1,j} e^_{q-1}) + (y^_qj e^_q + ... + y^_pj e^_p);
here it is large values among the leading coordinates y^_1j, ..., y^_{q-1,j} that flag influential observations.
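A sketch (simulated data; the cutoff of 3 standard deviations is an arbitrary choice for illustration) of screening observations by their scores on the leading and trailing components:

```python
import numpy as np

rng = np.random.default_rng(2)
# Simulated data with one planted observation that breaks the correlation structure
X = rng.multivariate_normal(np.zeros(3),
                            [[2.0, 1.2, 0.8],
                             [1.2, 1.5, 0.6],
                             [0.8, 0.6, 1.0]], size=100)
X[0] = [1.0, -2.0, 2.0]   # not extreme on any single variable

Xc = X - X.mean(axis=0)
S = np.cov(X, rowvar=False)
lam, E = np.linalg.eigh(S)
lam, E = lam[::-1], E[:, ::-1]
scores = Xc @ E                        # component scores, columns ordered by variance

# Standardize each score by its standard deviation sqrt(lambda_i)
z = scores / np.sqrt(lam)

q = 2                                               # keep the first q components
outlier_flag = np.abs(z[:, q:]).max(axis=1) > 3     # large scores on the last components
influence_flag = np.abs(z[:, :q]).max(axis=1) > 3   # large scores on the first components
print(np.where(outlier_flag)[0], np.where(influence_flag)[0])
```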

66 Potential Influential Observations in the Men's Track Data
[Figure: plot of the countries' component scores highlighting potentially influential observations.]

67 Men's Track Data: Influential Observations?
Western Samoa and the Cook Islands are off the scale in the principal component analysis of the men's track data. When we remove these two countries, the results change:
[Table: eigenvalues, proportions, and cumulative proportions for all the data versus without these two countries.]


Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference

More information

Data Preprocessing Tasks

Data Preprocessing Tasks Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can

More information

Principal Components Theory Notes

Principal Components Theory Notes Principal Components Theory Notes Charles J. Geyer August 29, 2007 1 Introduction These are class notes for Stat 5601 (nonparametrics) taught at the University of Minnesota, Spring 2006. This not a theory

More information

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world

More information

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data.

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data. Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two

More information

Dimension Reduction and Classification Using PCA and Factor. Overview

Dimension Reduction and Classification Using PCA and Factor. Overview Dimension Reduction and Classification Using PCA and - A Short Overview Laboratory for Interdisciplinary Statistical Analysis Department of Statistics Virginia Tech http://www.stat.vt.edu/consult/ March

More information

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2 Final Review Sheet The final will cover Sections Chapters 1,2,3 and 4, as well as sections 5.1-5.4, 6.1-6.2 and 7.1-7.3 from chapters 5,6 and 7. This is essentially all material covered this term. Watch

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Laurenz Wiskott Institute for Theoretical Biology Humboldt-University Berlin Invalidenstraße 43 D-10115 Berlin, Germany 11 March 2004 1 Intuition Problem Statement Experimental

More information

Matrices and Deformation

Matrices and Deformation ES 111 Mathematical Methods in the Earth Sciences Matrices and Deformation Lecture Outline 13 - Thurs 9th Nov 2017 Strain Ellipse and Eigenvectors One way of thinking about a matrix is that it operates

More information

PRINCIPAL COMPONENT ANALYSIS

PRINCIPAL COMPONENT ANALYSIS PRINCIPAL COMPONENT ANALYSIS 1 INTRODUCTION One of the main problems inherent in statistics with more than two variables is the issue of visualising or interpreting data. Fortunately, quite often the problem

More information

9.1 Orthogonal factor model.

9.1 Orthogonal factor model. 36 Chapter 9 Factor Analysis Factor analysis may be viewed as a refinement of the principal component analysis The objective is, like the PC analysis, to describe the relevant variables in study in terms

More information

Machine Learning 2nd Edition

Machine Learning 2nd Edition INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Principal Component Analysis

Principal Component Analysis I.T. Jolliffe Principal Component Analysis Second Edition With 28 Illustrations Springer Contents Preface to the Second Edition Preface to the First Edition Acknowledgments List of Figures List of Tables

More information

Quantitative Understanding in Biology Principal Components Analysis

Quantitative Understanding in Biology Principal Components Analysis Quantitative Understanding in Biology Principal Components Analysis Introduction Throughout this course we have seen examples of complex mathematical phenomena being represented as linear combinations

More information

Singular Value Decomposition and Principal Component Analysis (PCA) I

Singular Value Decomposition and Principal Component Analysis (PCA) I Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression

More information

Introduction to Machine Learning

Introduction to Machine Learning 10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what

More information

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The

More information

Comparisons of Several Multivariate Populations

Comparisons of Several Multivariate Populations Comparisons of Several Multivariate Populations Edps/Soc 584, Psych 594 Carolyn J Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees,

More information

1. Introduction to Multivariate Analysis

1. Introduction to Multivariate Analysis 1. Introduction to Multivariate Analysis Isabel M. Rodrigues 1 / 44 1.1 Overview of multivariate methods and main objectives. WHY MULTIVARIATE ANALYSIS? Multivariate statistical analysis is concerned with

More information

1 Principal Components Analysis

1 Principal Components Analysis Lecture 3 and 4 Sept. 18 and Sept.20-2006 Data Visualization STAT 442 / 890, CM 462 Lecture: Ali Ghodsi 1 Principal Components Analysis Principal components analysis (PCA) is a very popular technique for

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis 1 Principal Component Analysis Principal component analysis is a technique used to construct composite variable(s) such that composite variable(s) are weighted combination

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

STAT 501 Assignment 1 Name Spring 2005

STAT 501 Assignment 1 Name Spring 2005 STAT 50 Assignment Name Spring 005 Reading Assignment: Johnson and Wichern, Chapter, Sections.5 and.6, Chapter, and Chapter. Review matrix operations in Chapter and Supplement A. Written Assignment: Due

More information

1 Singular Value Decomposition and Principal Component

1 Singular Value Decomposition and Principal Component Singular Value Decomposition and Principal Component Analysis In these lectures we discuss the SVD and the PCA, two of the most widely used tools in machine learning. Principal Component Analysis (PCA)

More information

Latent Trait Reliability

Latent Trait Reliability Latent Trait Reliability Lecture #7 ICPSR Item Response Theory Workshop Lecture #7: 1of 66 Lecture Overview Classical Notions of Reliability Reliability with IRT Item and Test Information Functions Concepts

More information

ICS 6N Computational Linear Algebra Symmetric Matrices and Orthogonal Diagonalization

ICS 6N Computational Linear Algebra Symmetric Matrices and Orthogonal Diagonalization ICS 6N Computational Linear Algebra Symmetric Matrices and Orthogonal Diagonalization Xiaohui Xie University of California, Irvine xhx@uci.edu Xiaohui Xie (UCI) ICS 6N 1 / 21 Symmetric matrices An n n

More information

Dot Products, Transposes, and Orthogonal Projections

Dot Products, Transposes, and Orthogonal Projections Dot Products, Transposes, and Orthogonal Projections David Jekel November 13, 2015 Properties of Dot Products Recall that the dot product or standard inner product on R n is given by x y = x 1 y 1 + +

More information

Lecture 7 Spectral methods

Lecture 7 Spectral methods CSE 291: Unsupervised learning Spring 2008 Lecture 7 Spectral methods 7.1 Linear algebra review 7.1.1 Eigenvalues and eigenvectors Definition 1. A d d matrix M has eigenvalue λ if there is a d-dimensional

More information

Unconstrained Ordination

Unconstrained Ordination Unconstrained Ordination Sites Species A Species B Species C Species D Species E 1 0 (1) 5 (1) 1 (1) 10 (4) 10 (4) 2 2 (3) 8 (3) 4 (3) 12 (6) 20 (6) 3 8 (6) 20 (6) 10 (6) 1 (2) 3 (2) 4 4 (5) 11 (5) 8 (5)

More information

Principal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R,

Principal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R, Principal Component Analysis (PCA) PCA is a widely used statistical tool for dimension reduction. The objective of PCA is to find common factors, the so called principal components, in form of linear combinations

More information

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables.

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables. Random vectors Recall that a random vector X = X X 2 is made up of, say, k random variables X k A random vector has a joint distribution, eg a density f(x), that gives probabilities P(X A) = f(x)dx Just

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

STATISTICAL LEARNING SYSTEMS

STATISTICAL LEARNING SYSTEMS STATISTICAL LEARNING SYSTEMS LECTURE 8: UNSUPERVISED LEARNING: FINDING STRUCTURE IN DATA Institute of Computer Science, Polish Academy of Sciences Ph. D. Program 2013/2014 Principal Component Analysis

More information

. = V c = V [x]v (5.1) c 1. c k

. = V c = V [x]v (5.1) c 1. c k Chapter 5 Linear Algebra It can be argued that all of linear algebra can be understood using the four fundamental subspaces associated with a matrix Because they form the foundation on which we later work,

More information

Experimental design. Matti Hotokka Department of Physical Chemistry Åbo Akademi University

Experimental design. Matti Hotokka Department of Physical Chemistry Åbo Akademi University Experimental design Matti Hotokka Department of Physical Chemistry Åbo Akademi University Contents Elementary concepts Regression Validation Hypotesis testing ANOVA PCA, PCR, PLS Clusters, SIMCA Design

More information

Principal Component Analysis & Factor Analysis. Psych 818 DeShon

Principal Component Analysis & Factor Analysis. Psych 818 DeShon Principal Component Analysis & Factor Analysis Psych 818 DeShon Purpose Both are used to reduce the dimensionality of correlated measurements Can be used in a purely exploratory fashion to investigate

More information

Principal components

Principal components Principal components Principal components is a general analysis technique that has some application within regression, but has a much wider use as well. Technical Stuff We have yet to define the term covariance,

More information

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26 Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

1 A factor can be considered to be an underlying latent variable: (a) on which people differ. (b) that is explained by unknown variables

1 A factor can be considered to be an underlying latent variable: (a) on which people differ. (b) that is explained by unknown variables 1 A factor can be considered to be an underlying latent variable: (a) on which people differ (b) that is explained by unknown variables (c) that cannot be defined (d) that is influenced by observed variables

More information

The Principal Component Analysis

The Principal Component Analysis The Principal Component Analysis Philippe B. Laval KSU Fall 2017 Philippe B. Laval (KSU) PCA Fall 2017 1 / 27 Introduction Every 80 minutes, the two Landsat satellites go around the world, recording images

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 23, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 47 High dimensional

More information

Random Vectors, Random Matrices, and Matrix Expected Value

Random Vectors, Random Matrices, and Matrix Expected Value Random Vectors, Random Matrices, and Matrix Expected Value James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 16 Random Vectors,

More information

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n

More information

A Introduction to Matrix Algebra and the Multivariate Normal Distribution

A Introduction to Matrix Algebra and the Multivariate Normal Distribution A Introduction to Matrix Algebra and the Multivariate Normal Distribution PRE 905: Multivariate Analysis Spring 2014 Lecture 6 PRE 905: Lecture 7 Matrix Algebra and the MVN Distribution Today s Class An

More information

Stat 206: Linear algebra

Stat 206: Linear algebra Stat 206: Linear algebra James Johndrow (adapted from Iain Johnstone s notes) 2016-11-02 Vectors We have already been working with vectors, but let s review a few more concepts. The inner product of two

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables /4/04 Structural Equation Modeling and Confirmatory Factor Analysis Advanced Statistics for Researchers Session 3 Dr. Chris Rakes Website: http://csrakes.yolasite.com Email: Rakes@umbc.edu Twitter: @RakesChris

More information

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT

More information

Basics of Multivariate Modelling and Data Analysis

Basics of Multivariate Modelling and Data Analysis Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 6. Principal component analysis (PCA) 6.1 Overview 6.2 Essentials of PCA 6.3 Numerical calculation of PCs 6.4 Effects of data preprocessing

More information

5 Inferences about a Mean Vector

5 Inferences about a Mean Vector 5 Inferences about a Mean Vector In this chapter we use the results from Chapter 2 through Chapter 4 to develop techniques for analyzing data. A large part of any analysis is concerned with inference that

More information

Principal Components Analysis

Principal Components Analysis Principal Components Analysis Lecture 9 August 2, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #9-8/2/2011 Slide 1 of 54 Today s Lecture Principal Components Analysis

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.]

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.] Math 43 Review Notes [Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty Dot Product If v (v, v, v 3 and w (w, w, w 3, then the

More information

A Peak to the World of Multivariate Statistical Analysis

A Peak to the World of Multivariate Statistical Analysis A Peak to the World of Multivariate Statistical Analysis Real Contents Real Real Real Why is it important to know a bit about the theory behind the methods? Real 5 10 15 20 Real 10 15 20 Figure: Multivariate

More information

Next is material on matrix rank. Please see the handout

Next is material on matrix rank. Please see the handout B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0

More information

Singular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces

Singular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang

More information

PRINCIPAL COMPONENTS ANALYSIS

PRINCIPAL COMPONENTS ANALYSIS 121 CHAPTER 11 PRINCIPAL COMPONENTS ANALYSIS We now have the tools necessary to discuss one of the most important concepts in mathematical statistics: Principal Components Analysis (PCA). PCA involves

More information

Introduction to Factor Analysis

Introduction to Factor Analysis to Factor Analysis Lecture 10 August 2, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #10-8/3/2011 Slide 1 of 55 Today s Lecture Factor Analysis Today s Lecture Exploratory

More information

Statistics 910, #5 1. Regression Methods

Statistics 910, #5 1. Regression Methods Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known

More information

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x = Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.

More information