TMA4267 Linear Statistical Models V2017 [L3]

Size: px
Start display at page:

Download "TMA4267 Linear Statistical Models V2017 [L3]"

Transcription

1 TMA4267 Linear Statistical Models V2017 [L3] Part 1: Multivariate RVs and normal distribution (L3) Covariance and positive definiteness [H:2.2,2.3,3.3], Principal components [H ] Quiz with Kahoot! Mette Langaas Department of Mathematical Sciences, NTNU To be lectured: January 17, / 26

2 Previously X (p 1) : random vector, described by joint probability distribution function (pdf) f (x) and cumulative distribution function (cdf) F (x), or, as we will see (in L4), by the (multivariate) moment generating function (MGF) M X (t) = E(e tt X ). Important aspects: moments. E(X ): mean of a random vector (or matrix) is found as the mean of each element. Cov(X ) = E((X µ)(x µ) T ): p p variance-covariance matrix, with variances on the diagonal and covariance off-diagonal, real, symmetric. Rules for vector of linear combinations CX : E(CX ) = Cµ and Cov(CX ) = CΣC T. 1 / 26

3 Today! Requirements and properties of Σ = Cov(X ): symmetric, positive definite (SPD), via spectral decomposition (eigenvalues/eigenvectors). The square root matrix. Linear combinations with maximal variability: principal components are linear combinations made from eigenvectors. PCA-plots. Kahoot! on what we have worked with so far. Next lecture: move on to multivariate normal data! 2 / 26

4 Drinking habits data set Coffee Tea Cocoa Liquer Wine Beer Norway Danmark Finland Iceland Sweden France Ireland Italy Jugoslavia The Netherlands Poland Portugal Soviet Union Spain Schweitz Great Britain Chech Repl Germany Hungary Austria New Zealand / 26

5 The variance-covariance matrix and positive definiteness X (p 1) with symmetric Σ = Cov(X ) Want variance of linear combination to be positive: c T Σc for all c 0, which means that Σ needs to positive definite. Write Σ > 0. This is true if all eigenvalues of Σ are positive (eigenvalues of symmetric matrix are real). Spectral decompositions (diagonalization): Σ = PΛP T. Square root matrix defined from spectral decomposition: Σ 1 2 = PΛ 1 2 P T. 4 / 26

6 Drinking habits Coffee Tea Cocoa Liquer Wine Beer Norway Danmark Finland Iceland Sweden France Ireland Italy Jugoslavia The Netherlands Poland Portugal Soviet Union Spain Schweitz Great Britain Chech Repl Germany Hungary Austria New Zealand / 26

7 Estimators for µ and Σ [H3.3] X 1, X 2,..., X n i.i.d E(X ) = µ and Cov(X ). X = 1 n n j=1 S 2 = 1 n 1 X j n (X j X )(X j X ) T j=1 are two commonly used estimators for the mean and covariance matrix. RecEx1.P7: we may write S 2 using a centering matrix. 6 / 26

8 Drinking habits data set > drinkcov Coffee Tea Cocoa Liquer Wine Beer Coffee Tea Cocoa Liquer Wine Beer / 26

9 Correlation plot Coffee Tea Cocoa Liquer Wine Beer 1 Coffee 0.8 Tea Cocoa Liquer 0.2 Wine Beer / 26

10 prcomp drink <- read.csv("drikke.txt",sep=",",header=true) drink <- na.omit(drink) # remove missing data # now for PCA pca <- prcomp(drink,scale=true) # scale: variables are scaled, and automatically center=true > names(pca) [1] "sdev" "rotation" "center" "scale" "x" > pca$rotation # the loadings PC1 PC2 PC3 PC4 PC5 PC6 Coffee Tea Cocoa Liquer Wine Beer > s <- cor(drink) # cor, not cov, since covariates are scaled > eigen(s) # same as pca$rotations - opposite sign of some vectors $values [1] $vectors [,1] [,2] [,3] [,4] [,5] [,6] [1,] [2,] [3,] [4,] [5,] [6,] / 26

11 Principal components Let Σ be the covariance matrix associated with the random vector X p 1. The covariance matrix has the eigenvalue-vector pairs (λ j, e j ), where λ 1 λ 2 λ p 0. The mth principal component is given by Y m = e T mx = e m1 X 1 + e m2 X e mp X p and has Var(Y m ) = e T mσe m = λ m, i = 1, 2,..., p Cov(Y i, Y m ) = e T i Σe m = 0 i m 10 / 26

12 Principal components: idea 1. We choose principal component 1, PC 1 = c T 1 X, to have maximal variance max Var(c T c 1 0,c T 1 c 1 X ) 1=1 2. We choose principal component 2, PC 2 = c T 2 X, to have maximal variance and to be uncorrelated with PC 1. max Var(c T c 2 0,c T 2 c 2 X ) and c T 1 Σc 2 = 0 2=1 11 / 26

13 Principal components: idea 3. We choose principal component 3, PC 1 = c T 3 X, to have maximal variance and be uncorrelated with PC 1 and PC 2. for i = 1, and so on. max Var(c T c 3 0,c T 3 c 3 X ) and c T i Σc 3 = 0 3=1 It can be shown that choosing c i = e i (ith eigenvector of Σ) fulfills these requirements. 12 / 26

14 Principal component scores > pca$x PC1 PC2 PC3 PC4 PC5 PC6 Norway Danmark Finland Iceland Sweden France Ireland Italy Jugoslavia The Netherlands Poland Portugal Soviet Union Spain Schweitz Great Britain Chech Repl Germany Hungary Austria New Zealand / 26

15 Scores PCA1 vs PCA2 The Netherlands pca$x[, 2] Italy France Germany Austria Danmark Schweitz Norway Sweden Finland Portugal Iceland Spain Chech Repl Jugoslavia Hungary Soviet Union Poland New Zealand Great Britain Ireland pca$x[, 1] 14 / 26

16 Scores PCA1 vs PCA3 Italy Portugal Ireland pca$x[, 3] France New Zealand Soviet Union Jugoslavia Schweitz Norway Austria Danmark Sweden Spain Iceland Germany The Netherlands Chech Repl Poland Finland Great Britain Hungary pca$x[, 1] 15 / 26

17 Scores PCA2 vs PCA3 Italy Ireland Portugal pca$x[, 3] Poland Soviet Union Great Britain New Zealand France Jugoslavia Spain Iceland Chech Repl Schweitz Norway Austria Danmark Sweden Germany Finland The Netherlan Hungary pca$x[, 2] 16 / 26

18 Scores PCA1 vs PCA2 with biplot PC2 Wine Italy France The Netherlands Coffee Cocoa Germany Austria Beer Danmark Schweitz Norway Sweden Finland Portugal Iceland Spain Chech Repl New Zealand Jugoslavia Liquer Hungary Soviet Union Poland Great Britain Tea Ireland PC1 17 / 26

19 Biplots PC3 Italy Portugal Irelan Wine New Zealand TeaGreat Britain France Soviet Union Jugoslavia Schweitz Norway Austria Danmark Sweden Cocoa Spain Iceland Coffee Germany The Netherlands Beer Chech Repl Finland Poland Hungary Liquer PC3 Italy Portugal Ireland New Great Zealand Tea Britain Wine Soviet Union France Jugoslavia Norway Schweitz Danmark Austria Spain Iceland Sweden Cocoa Beer Germany Coffee The Nethe Chech Repl Poland Finland Hungary Liquer PC1 PC PC4 Norway Iceland Coffee Sweden Finland Danmark Soviet Union Jugoslavia Austria New Zealand Poland Tea Irelan Portugal Great Britain Italy Schweitz Germany Spain Chech Repl Liquer The Netherlands Beer Hungary Cocoa France Wine PC5 Beer Danmark Austria Germany Chech New Repl Zealand Wine Portugal Coffee Finland Hungary Spain Schweitz Sweden Great Britain France Norway Irelan Italy Jugoslavia Liquer Tea Iceland Soviet PolandUnion The Cocoa Netherlands PC1 PC1 18 / 26

20 Proportion of total population variance Total population variance: p j=1 Var(X j) = trσ = p j=1 λ j = p j=1 Var(Z j). Proportion of total population variance explained by PC m: λ m p j=1 λ j Proportion of total population variance explained by the first m PCs: m j=1 λ j p j=1 λ j 19 / 26

21 How many PCs are needed? Dependent on: The proportion of the total sample variance that we would like to explain. 80%? More? Look at the eigenvalues; small eigenvalues may be an evidence of collinearity problems. 20 / 26

22 Importance of components > summary(pca) Importance of components: PC1 PC2 PC3 PC4 PC5 PC6 Standard deviation Proportion of Variance Cumulative Proportion > eigen(s) $values [1] > sqrt(eigen(s)$values) [1] / 26

23 Screeplot pca Variances / 26

24 PC from standardized variables X can be standardized to have mean 0 and unit variances. X = V 1 2 (X µ) Principal components made from standardized variables will be based on the eigenvalues and eigenvectors of the correlation matrix ρ = V 1 2 ΣV 1 2. Achilles heel: Since Σ and ρ do not have the same eigenvectors/eigenvalues, the principal components made from Σ and ρ will not be the same. Unless we have a good reason to compare the variances for the different X j s we should make PCs from the standardized variables. For standardized variables p j=1 Var(X j ) = p, and Proportion of total population variance explained by PC m: λ mp. 23 / 26

25 Principal components from singular value decomposition The singular value decomposistion of a (data) matrix X n p is given by: X n p = U n p D p p V T p p where the columns of U are the eigenvectors of X X T D is a diagonal matrix with singular values on the diagonal, i.e. the square root of the eigenvalues of X X T and X T X (they have the same eigenvalues). the columns of V are the eigenvectors of X T X. And, the principal components (scores) of the data are defined as the columns of Z = X V = UD 24 / 26

26 PCR: summary PCA finds linear combinations Y that best represents the X. The PCs are found in an unsupervised way. The "truth" is not known. A plot of PC1 vs PC2 is often used to see if there is separation (subgroups in the data). The principal component loadings are often given interpretation (overall consumption, PCA can be combined with linear regression. 25 / 26

27 Quiz with Kahoot! at kahoot.it. - based on what we have gone through so far! 26 / 26

2. Matrix Algebra and Random Vectors

2. Matrix Algebra and Random Vectors 2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

Multivariate Analysis

Multivariate Analysis Prof. Dr. J. Franke All of Statistics 3.1 Multivariate Analysis High dimensional data X 1,..., X N, i.i.d. random vectors in R p. As a data matrix X: objects values of p features 1 X 11 X 12... X 1p 2.

More information

Notes on Random Vectors and Multivariate Normal

Notes on Random Vectors and Multivariate Normal MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution

More information

The Multivariate Normal Distribution. In this case according to our theorem

The Multivariate Normal Distribution. In this case according to our theorem The Multivariate Normal Distribution Defn: Z R 1 N(0, 1) iff f Z (z) = 1 2π e z2 /2. Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) T with the Z i independent and each Z i N(0, 1). In this

More information

Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining

Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can

More information

Eigenvalues, Eigenvectors, and an Intro to PCA

Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.

More information

Eigenvalues, Eigenvectors, and an Intro to PCA

Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.

More information

Multivariate Statistics (I) 2. Principal Component Analysis (PCA)

Multivariate Statistics (I) 2. Principal Component Analysis (PCA) Multivariate Statistics (I) 2. Principal Component Analysis (PCA) 2.1 Comprehension of PCA 2.2 Concepts of PCs 2.3 Algebraic derivation of PCs 2.4 Selection and goodness-of-fit of PCs 2.5 Algebraic derivation

More information

P (x). all other X j =x j. If X is a continuous random vector (see p.172), then the marginal distributions of X i are: f(x)dx 1 dx n

P (x). all other X j =x j. If X is a continuous random vector (see p.172), then the marginal distributions of X i are: f(x)dx 1 dx n JOINT DENSITIES - RANDOM VECTORS - REVIEW Joint densities describe probability distributions of a random vector X: an n-dimensional vector of random variables, ie, X = (X 1,, X n ), where all X is are

More information

ECON Fundamentals of Probability

ECON Fundamentals of Probability ECON 351 - Fundamentals of Probability Maggie Jones 1 / 32 Random Variables A random variable is one that takes on numerical values, i.e. numerical summary of a random outcome e.g., prices, total GDP,

More information

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17 Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 17 Outline Filters and Rotations Generating co-varying random fields Translating co-varying fields into

More information

Intelligent Data Analysis. Principal Component Analysis. School of Computer Science University of Birmingham

Intelligent Data Analysis. Principal Component Analysis. School of Computer Science University of Birmingham Intelligent Data Analysis Principal Component Analysis Peter Tiňo School of Computer Science University of Birmingham Discovering low-dimensional spatial layout in higher dimensional spaces - 1-D/3-D example

More information

Basic Concepts in Matrix Algebra

Basic Concepts in Matrix Algebra Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1

More information

Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as

Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as MAHALANOBIS DISTANCE Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as d E (x, y) = (x 1 y 1 ) 2 + +(x p y p ) 2

More information

Unsupervised Learning: Dimensionality Reduction

Unsupervised Learning: Dimensionality Reduction Unsupervised Learning: Dimensionality Reduction CMPSCI 689 Fall 2015 Sridhar Mahadevan Lecture 3 Outline In this lecture, we set about to solve the problem posed in the previous lecture Given a dataset,

More information

Principal Components Analysis

Principal Components Analysis Principal Components Analysis Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 16-Mar-2017 Nathaniel E. Helwig (U of Minnesota) Principal

More information

Lecture 4: Principal Component Analysis and Linear Dimension Reduction

Lecture 4: Principal Component Analysis and Linear Dimension Reduction Lecture 4: Principal Component Analysis and Linear Dimension Reduction Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail:

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

Lecture 7 Spectral methods

Lecture 7 Spectral methods CSE 291: Unsupervised learning Spring 2008 Lecture 7 Spectral methods 7.1 Linear algebra review 7.1.1 Eigenvalues and eigenvectors Definition 1. A d d matrix M has eigenvalue λ if there is a d-dimensional

More information

STATISTICAL LEARNING SYSTEMS

STATISTICAL LEARNING SYSTEMS STATISTICAL LEARNING SYSTEMS LECTURE 8: UNSUPERVISED LEARNING: FINDING STRUCTURE IN DATA Institute of Computer Science, Polish Academy of Sciences Ph. D. Program 2013/2014 Principal Component Analysis

More information

Principal Component Analysis (PCA) Principal Component Analysis (PCA)

Principal Component Analysis (PCA) Principal Component Analysis (PCA) Recall: Eigenvectors of the Covariance Matrix Covariance matrices are symmetric. Eigenvectors are orthogonal Eigenvectors are ordered by the magnitude of eigenvalues: λ 1 λ 2 λ p {v 1, v 2,..., v n } Recall:

More information

Random Vectors 1. STA442/2101 Fall See last slide for copyright information. 1 / 30

Random Vectors 1. STA442/2101 Fall See last slide for copyright information. 1 / 30 Random Vectors 1 STA442/2101 Fall 2017 1 See last slide for copyright information. 1 / 30 Background Reading: Renscher and Schaalje s Linear models in statistics Chapter 3 on Random Vectors and Matrices

More information

Linear Dimensionality Reduction

Linear Dimensionality Reduction Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Principal Component Analysis 3 Factor Analysis

More information

Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA

Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis: Uses one group of variables (we will call this X) In

More information

Lecture 15: Multivariate normal distributions

Lecture 15: Multivariate normal distributions Lecture 15: Multivariate normal distributions Normal distributions with singular covariance matrices Consider an n-dimensional X N(µ,Σ) with a positive definite Σ and a fixed k n matrix A that is not of

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is

More information

18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013. [ ] variance: E[X] =, and Cov[X] = Σ = =

18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013. [ ] variance: E[X] =, and Cov[X] = Σ = = 18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013 1. Consider a bivariate random variable: [ ] X X = 1 X 2 with mean and co [ ] variance: [ ] [ α1 Σ 1,1 Σ 1,2 σ 2 ρσ 1 σ E[X] =, and Cov[X]

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Vectors and Matrices Statistics with Vectors and Matrices

Vectors and Matrices Statistics with Vectors and Matrices Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc

More information

Gaussian random variables inr n

Gaussian random variables inr n Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b

More information

3.1. The probabilistic view of the principal component analysis.

3.1. The probabilistic view of the principal component analysis. 301 Chapter 3 Principal Components and Statistical Factor Models This chapter of introduces the principal component analysis (PCA), briefly reviews statistical factor models PCA is among the most popular

More information

Lecture 14: Multivariate mgf s and chf s

Lecture 14: Multivariate mgf s and chf s Lecture 14: Multivariate mgf s and chf s Multivariate mgf and chf For an n-dimensional random vector X, its mgf is defined as M X (t) = E(e t X ), t R n and its chf is defined as φ X (t) = E(e ıt X ),

More information

Principal Components Theory Notes

Principal Components Theory Notes Principal Components Theory Notes Charles J. Geyer August 29, 2007 1 Introduction These are class notes for Stat 5601 (nonparametrics) taught at the University of Minnesota, Spring 2006. This not a theory

More information

REVIEW OF MAIN CONCEPTS AND FORMULAS A B = Ā B. Pr(A B C) = Pr(A) Pr(A B C) =Pr(A) Pr(B A) Pr(C A B)

REVIEW OF MAIN CONCEPTS AND FORMULAS A B = Ā B. Pr(A B C) = Pr(A) Pr(A B C) =Pr(A) Pr(B A) Pr(C A B) REVIEW OF MAIN CONCEPTS AND FORMULAS Boolean algebra of events (subsets of a sample space) DeMorgan s formula: A B = Ā B A B = Ā B The notion of conditional probability, and of mutual independence of two

More information

TAMS39 Lecture 2 Multivariate normal distribution

TAMS39 Lecture 2 Multivariate normal distribution TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution

More information

Chapter 6 Scatterplots, Association and Correlation

Chapter 6 Scatterplots, Association and Correlation Chapter 6 Scatterplots, Association and Correlation Looking for Correlation Example Does the number of hours you watch TV per week impact your average grade in a class? Hours 12 10 5 3 15 16 8 Grade 70

More information

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx INDEPENDENCE, COVARIANCE AND CORRELATION Independence: Intuitive idea of "Y is independent of X": The distribution of Y doesn't depend on the value of X. In terms of the conditional pdf's: "f(y x doesn't

More information

A Markov system analysis application on labour market dynamics: The case of Greece

A Markov system analysis application on labour market dynamics: The case of Greece + A Markov system analysis application on labour market dynamics: The case of Greece Maria Symeonaki Glykeria Stamatopoulou This project has received funding from the European Union s Horizon 2020 research

More information

Multivariate Random Variable

Multivariate Random Variable Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate

More information

Announcements (repeat) Principal Components Analysis

Announcements (repeat) Principal Components Analysis 4/7/7 Announcements repeat Principal Components Analysis CS 5 Lecture #9 April 4 th, 7 PA4 is due Monday, April 7 th Test # will be Wednesday, April 9 th Test #3 is Monday, May 8 th at 8AM Just hour long

More information

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18 Variance reduction p. 1/18 Variance reduction Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Variance reduction p. 2/18 Example Use simulation to compute I = 1 0 e x dx We

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Dimension reduction Principal Component Analysis (PCA) The problem in exploratory multivariate data analysis usually is

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Clusters. Unsupervised Learning. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Clusters. Unsupervised Learning. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Clusters Unsupervised Learning Luc Anselin http://spatial.uchicago.edu 1 curse of dimensionality principal components multidimensional scaling classical clustering methods 2 Curse of Dimensionality 3 Curse

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance Covariance Lecture 0: Covariance / Correlation & General Bivariate Normal Sta30 / Mth 30 We have previously discussed Covariance in relation to the variance of the sum of two random variables Review Lecture

More information

Whitening and Coloring Transformations for Multivariate Gaussian Data. A Slecture for ECE 662 by Maliha Hossain

Whitening and Coloring Transformations for Multivariate Gaussian Data. A Slecture for ECE 662 by Maliha Hossain Whitening and Coloring Transformations for Multivariate Gaussian Data A Slecture for ECE 662 by Maliha Hossain Introduction This slecture discusses how to whiten data that is normally distributed. Data

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions

More information

PRINCIPAL COMPONENT ANALYSIS

PRINCIPAL COMPONENT ANALYSIS PRINCIPAL COMPONENT ANALYSIS 1 INTRODUCTION One of the main problems inherent in statistics with more than two variables is the issue of visualising or interpreting data. Fortunately, quite often the problem

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison [based on slides from Nina Balcan] slide 1 Goals for the lecture you should understand

More information

Exam 2. Jeremy Morris. March 23, 2006

Exam 2. Jeremy Morris. March 23, 2006 Exam Jeremy Morris March 3, 006 4. Consider a bivariate normal population with µ 0, µ, σ, σ and ρ.5. a Write out the bivariate normal density. The multivariate normal density is defined by the following

More information

1.1 Review of Probability Theory

1.1 Review of Probability Theory 1.1 Review of Probability Theory Angela Peace Biomathemtics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,

More information

The Information Content of Capacity Utilisation Rates for Output Gap Estimates

The Information Content of Capacity Utilisation Rates for Output Gap Estimates The Information Content of Capacity Utilisation Rates for Output Gap Estimates Michael Graff and Jan-Egbert Sturm 15 November 2010 Overview Introduction and motivation Data Output gap data: OECD Economic

More information

Principal component analysis

Principal component analysis Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and

More information

Section 8.1. Vector Notation

Section 8.1. Vector Notation Section 8.1 Vector Notation Definition 8.1 Random Vector A random vector is a column vector X = [ X 1 ]. X n Each Xi is a random variable. Definition 8.2 Vector Sample Value A sample value of a random

More information

Principal Component Analysis Utilizing R and SAS Software s

Principal Component Analysis Utilizing R and SAS Software s International Journal of Current Microbiology and Applied Sciences ISSN: 2319-7706 Volume 7 Number 05 (2018) Journal homepage: http://www.ijcmas.com Original Research Article https://doi.org/10.20546/ijcmas.2018.705.441

More information

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline.

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline. Random Variables Amappingthattransformstheeventstotherealline. Example 1. Toss a fair coin. Define a random variable X where X is 1 if head appears and X is if tail appears. P (X =)=1/2 P (X =1)=1/2 Example

More information

PIRLS 2011 The PIRLS 2011 Teacher Working Conditions Scale

PIRLS 2011 The PIRLS 2011 Teacher Working Conditions Scale PIRLS 2011 The PIRLS 2011 Teacher Working Conditions Scale The Teacher Working Conditions (TWC) scale was created based on teachers responses concerning five potential problem areas described below. See

More information

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables.

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables. Random vectors Recall that a random vector X = X X 2 is made up of, say, k random variables X k A random vector has a joint distribution, eg a density f(x), that gives probabilities P(X A) = f(x)dx Just

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini April 27, 2018 1 / 1 Table of Contents 2 / 1 Linear Algebra Review Read 3.1 and 3.2 from text. 1. Fundamental subspace (rank-nullity, etc.) Im(X ) = ker(x T ) R

More information

Covariance and Correlation

Covariance and Correlation Covariance and Correlation ST 370 The probability distribution of a random variable gives complete information about its behavior, but its mean and variance are useful summaries. Similarly, the joint probability

More information

AD HOC DRAFTING GROUP ON TRANSNATIONAL ORGANISED CRIME (PC-GR-COT) STATUS OF RATIFICATIONS BY COUNCIL OF EUROPE MEMBER STATES

AD HOC DRAFTING GROUP ON TRANSNATIONAL ORGANISED CRIME (PC-GR-COT) STATUS OF RATIFICATIONS BY COUNCIL OF EUROPE MEMBER STATES Strasbourg, 29 May 2015 PC-GR-COT (2013) 2 EN_Rev AD HOC DRAFTING GROUP ON TRANSNATIONAL ORGANISED CRIME (PC-GR-COT) STATUS OF RATIFICATIONS BY COUNCIL OF EUROPE MEMBER STATES TO THE UNITED NATIONS CONVENTION

More information

Course topics (tentative) The role of random effects

Course topics (tentative) The role of random effects Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen

More information

PIRLS 2011 The PIRLS 2011 Students Motivated to Read Scale

PIRLS 2011 The PIRLS 2011 Students Motivated to Read Scale PIRLS 2011 The PIRLS 2011 Students Motivated to Read Scale The Students Motivated to Read (SMR) scale was created based on students degree of agreement with six statements described below. See Creating

More information

How Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa

How Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa How Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa 1 Outline Focus of the study Data Dispersion and forecast errors during turning points Testing efficiency

More information

ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT

ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT Pursuant to Article 1 of the Convention signed in Paris on 14th December 1960, and which came into force on 30th September 1961, the Organisation

More information

Trends in Human Development Index of European Union

Trends in Human Development Index of European Union Trends in Human Development Index of European Union Department of Statistics, Hacettepe University, Beytepe, Ankara, Turkey spxl@hacettepe.edu.tr, deryacal@hacettepe.edu.tr Abstract: The Human Development

More information

PIRLS 2011 The PIRLS 2011 Students Engaged in Reading Lessons Scale

PIRLS 2011 The PIRLS 2011 Students Engaged in Reading Lessons Scale PIRLS 2011 The PIRLS 2011 Students Engaged in Reading Lessons Scale The Students Engaged in Reading Lessons (ERL) scale was created based on students degree of agreement with seven statements described

More information

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN Lecture Notes 5 Convergence and Limit Theorems Motivation Convergence with Probability Convergence in Mean Square Convergence in Probability, WLLN Convergence in Distribution, CLT EE 278: Convergence and

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Learning with Singular Vectors

Learning with Singular Vectors Learning with Singular Vectors CIS 520 Lecture 30 October 2015 Barry Slaff Based on: CIS 520 Wiki Materials Slides by Jia Li (PSU) Works cited throughout Overview Linear regression: Given X, Y find w:

More information

PIRLS 2011 The PIRLS 2011 Teacher Career Satisfaction Scale

PIRLS 2011 The PIRLS 2011 Teacher Career Satisfaction Scale PIRLS 2011 The PIRLS 2011 Teacher Career Satisfaction Scale The Teacher Career Satisfaction (TCS) scale was created based on teachers degree of agreement with six statements described below. See Creating

More information

Notes on Implementation of Component Analysis Techniques

Notes on Implementation of Component Analysis Techniques Notes on Implementation of Component Analysis Techniques Dr. Stefanos Zafeiriou January 205 Computing Principal Component Analysis Assume that we have a matrix of centered data observations X = [x µ,...,

More information

4. Distributions of Functions of Random Variables

4. Distributions of Functions of Random Variables 4. Distributions of Functions of Random Variables Setup: Consider as given the joint distribution of X 1,..., X n (i.e. consider as given f X1,...,X n and F X1,...,X n ) Consider k functions g 1 : R n

More information

BASICS OF PROBABILITY

BASICS OF PROBABILITY October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,

More information

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis

More information

Probability Lecture III (August, 2006)

Probability Lecture III (August, 2006) robability Lecture III (August, 2006) 1 Some roperties of Random Vectors and Matrices We generalize univariate notions in this section. Definition 1 Let U = U ij k l, a matrix of random variables. Suppose

More information

Review: mostly probability and some statistics

Review: mostly probability and some statistics Review: mostly probability and some statistics C2 1 Content robability (should know already) Axioms and properties Conditional probability and independence Law of Total probability and Bayes theorem Random

More information

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26 Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1

More information

Test Problems for Probability Theory ,

Test Problems for Probability Theory , 1 Test Problems for Probability Theory 01-06-16, 010-1-14 1. Write down the following probability density functions and compute their moment generating functions. (a) Binomial distribution with mean 30

More information

Factor Analysis Continued. Psy 524 Ainsworth

Factor Analysis Continued. Psy 524 Ainsworth Factor Analysis Continued Psy 524 Ainsworth Equations Extraction Principal Axis Factoring Variables Skiers Cost Lift Depth Powder S1 32 64 65 67 S2 61 37 62 65 S3 59 40 45 43 S4 36 62 34 35 S5 62 46 43

More information

STATISTICA MULTIVARIATA 2

STATISTICA MULTIVARIATA 2 1 / 73 STATISTICA MULTIVARIATA 2 Fabio Rapallo Dipartimento di Scienze e Innovazione Tecnologica Università del Piemonte Orientale, Alessandria (Italy) fabio.rapallo@uniupo.it Alessandria, May 2016 2 /

More information

TIMSS 2011 The TIMSS 2011 Teacher Career Satisfaction Scale, Fourth Grade

TIMSS 2011 The TIMSS 2011 Teacher Career Satisfaction Scale, Fourth Grade TIMSS 2011 The TIMSS 2011 Teacher Career Satisfaction Scale, Fourth Grade The Teacher Career Satisfaction (TCS) scale was created based on teachers degree of agreement with the six statements described

More information

Stat 8053, Chapter 9, rev March 7, 2014

Stat 8053, Chapter 9, rev March 7, 2014 Stat 8053, Chapter 9, rev March 7, 2014 Principal Components with p = 2 Suppose x is 2 1 with mean µ and covariance matrix Σ. Case I Suppose Σ is in correlation form, Σ(1, ρ) = ( 1 ρ ρ 1 If ρ 0, then the

More information

ECON Introductory Econometrics. Lecture 13: Internal and external validity

ECON Introductory Econometrics. Lecture 13: Internal and external validity ECON4150 - Introductory Econometrics Lecture 13: Internal and external validity Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 9 Lecture outline 2 Definitions of internal and external

More information

Principal Component Analysis and Singular Value Decomposition. Volker Tresp, Clemens Otte Summer 2014

Principal Component Analysis and Singular Value Decomposition. Volker Tresp, Clemens Otte Summer 2014 Principal Component Analysis and Singular Value Decomposition Volker Tresp, Clemens Otte Summer 2014 1 Motivation So far we always argued for a high-dimensional feature space Still, in some cases it makes

More information

Online Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales

Online Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales Online Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales 1 Table A.1 The Eurobarometer Surveys The Eurobarometer surveys are the products of a unique program

More information

Chapter 4. Multivariate Distributions. Obviously, the marginal distributions may be obtained easily from the joint distribution:

Chapter 4. Multivariate Distributions. Obviously, the marginal distributions may be obtained easily from the joint distribution: 4.1 Bivariate Distributions. Chapter 4. Multivariate Distributions For a pair r.v.s (X,Y ), the Joint CDF is defined as F X,Y (x, y ) = P (X x,y y ). Obviously, the marginal distributions may be obtained

More information

TIMSS 2011 The TIMSS 2011 School Discipline and Safety Scale, Fourth Grade

TIMSS 2011 The TIMSS 2011 School Discipline and Safety Scale, Fourth Grade TIMSS 2011 The TIMSS 2011 School Discipline and Safety Scale, Fourth Grade The School Discipline and Safety (DAS) scale was created based on principals responses to the ten statements described below.

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

Moment Generating Function. STAT/MTHE 353: 5 Moment Generating Functions and Multivariate Normal Distribution

Moment Generating Function. STAT/MTHE 353: 5 Moment Generating Functions and Multivariate Normal Distribution Moment Generating Function STAT/MTHE 353: 5 Moment Generating Functions and Multivariate Normal Distribution T. Linder Queen s University Winter 07 Definition Let X (X,...,X n ) T be a random vector and

More information

Multivariate Distributions (Hogg Chapter Two)

Multivariate Distributions (Hogg Chapter Two) Multivariate Distributions (Hogg Chapter Two) STAT 45-1: Mathematical Statistics I Fall Semester 15 Contents 1 Multivariate Distributions 1 11 Random Vectors 111 Two Discrete Random Variables 11 Two Continuous

More information

TIMSS 2011 The TIMSS 2011 Instruction to Engage Students in Learning Scale, Fourth Grade

TIMSS 2011 The TIMSS 2011 Instruction to Engage Students in Learning Scale, Fourth Grade TIMSS 2011 The TIMSS 2011 Instruction to Engage Students in Learning Scale, Fourth Grade The Instruction to Engage Students in Learning (IES) scale was created based on teachers responses to how often

More information

Statistics 351 Probability I Fall 2006 (200630) Final Exam Solutions. θ α β Γ(α)Γ(β) (uv)α 1 (v uv) β 1 exp v }

Statistics 351 Probability I Fall 2006 (200630) Final Exam Solutions. θ α β Γ(α)Γ(β) (uv)α 1 (v uv) β 1 exp v } Statistics 35 Probability I Fall 6 (63 Final Exam Solutions Instructor: Michael Kozdron (a Solving for X and Y gives X UV and Y V UV, so that the Jacobian of this transformation is x x u v J y y v u v

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Principal Analysis Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board

More information

18 Bivariate normal distribution I

18 Bivariate normal distribution I 8 Bivariate normal distribution I 8 Example Imagine firing arrows at a target Hopefully they will fall close to the target centre As we fire more arrows we find a high density near the centre and fewer

More information