An introduction to multivariate data
Angela Montanari (angela.montanari@unibo.it)

1 The data matrix

The starting point of any analysis of multivariate data is a data matrix, i.e. a collection of $n$ observations on a set of $p$ characters $X_1, \ldots, X_k, \ldots, X_p$ (they may be numeric variables, binary variables, or suitably coded categorical variables):

$$
X = \begin{bmatrix}
x_{11} & \cdots & x_{1k} & \cdots & x_{1p} \\
\vdots &        & \vdots &        & \vdots \\
x_{i1} & \cdots & x_{ik} & \cdots & x_{ip} \\
\vdots &        & \vdots &        & \vdots \\
x_{n1} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
$$

The element $x_{ik}$ represents the value of the $k$-th variable on the $i$-th observed unit. Each row of $X$ corresponds to an observed unit. Each column of $X$ corresponds to an observed variable. The $n$ statistical units can be thought of as $n$ points in the $p$-dimensional space $\mathbb{R}^p$.

The following data matrix ($24 \times 3$) contains data on the length, width and height (in mm) of the carapace of 24 male painted turtles (Jolicoeur and Mosimann, 1960). There is one row per turtle and one column per variable.

Multivariate analysis is concerned either with studying the relationships between variables or with studying the similarities between units. In the first set of methods we will consider:
Table 1: Data matrix (columns: length, width, height)
- Principal component analysis
- Factor analysis
- Discriminant analysis

In the second set we will deal with:

- Clustering methods.

Starting from the data matrix $X$, a series of different matrices can be derived. We will first concentrate on the matrices dealing with relationships between variables, leaving the measurement of dissimilarities between units until we deal with clustering.

2 The average vector

When dealing with numeric variables, we may want to associate with each variable its arithmetic mean. The $p$ means can be collected in a $p$-dimensional vector

$$
\bar{x} = (\bar{x}_1, \ldots, \bar{x}_k, \ldots, \bar{x}_p)^\top = \left(\frac{1}{n}\, 1_n^\top X\right)^\top = \frac{1}{n}\, X^\top 1_n
$$

where $1_n$ is the $n$-dimensional vector of ones.

3 The mean centered data matrix

In certain applications it is useful to express the variables as deviations from the mean. The data matrix becomes:

$$
\tilde{X} = \begin{bmatrix}
\tilde{x}_{11} & \cdots & \tilde{x}_{1k} & \cdots & \tilde{x}_{1p} \\
\vdots &        & \vdots &        & \vdots \\
\tilde{x}_{i1} & \cdots & \tilde{x}_{ik} & \cdots & \tilde{x}_{ip} \\
\vdots &        & \vdots &        & \vdots \\
\tilde{x}_{n1} & \cdots & \tilde{x}_{nk} & \cdots & \tilde{x}_{np}
\end{bmatrix}
$$

where $\tilde{x}_{ik} = x_{ik} - \bar{x}_k$.
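As a minimal sketch of the two definitions above, the following plain-Python snippet computes the mean vector and the deviations from the mean for a small data matrix. The 4 × 2 matrix and the helper names `mean_vector` and `center` are hypothetical and chosen only for illustration; they are not part of the original notes.

```python
# Hypothetical 4 x 2 data matrix (n = 4 units, p = 2 variables); values are made up.
X = [
    [2.0, 10.0],
    [4.0, 14.0],
    [6.0, 12.0],
    [8.0, 20.0],
]


def mean_vector(X):
    """Return the p arithmetic means, one per column of X."""
    n = len(X)
    p = len(X[0])
    return [sum(row[k] for row in X) / n for k in range(p)]


def center(X):
    """Return the mean centered matrix with entries x_ik - mean_k."""
    xbar = mean_vector(X)
    return [[row[k] - xbar[k] for k in range(len(xbar))] for row in X]


xbar = mean_vector(X)
X_tilde = center(X)
# Sum of each centered column; this should be zero for every variable.
column_sums = [sum(row[k] for row in X_tilde) for k in range(len(xbar))]

print(xbar)
print(column_sums)
```

Every centered column sums to zero here, which is the sample-level reason why the mean of a mean centered variable is zero.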
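In matrix form,

$$
\tilde{X} = X - 1_n \bar{x}^\top = X - 1_n \left(\frac{1}{n}\, 1_n^\top X\right) = \left(I_n - \frac{1}{n}\, 1_n 1_n^\top\right) X = AX
$$

where $A = I_n - \frac{1}{n}\, 1_n 1_n^\top$ is the so-called centering matrix. $A$ is square ($n \times n$), symmetric and idempotent. Each column of $\tilde{X}$ has zero sum (zero mean). The matrix $\tilde{X}$ defines a translation of the origin of the original reference system. The shape of the point cloud remains unchanged, but the origin of the axes is moved to $\bar{x}$.

4 The standardized data matrix

If one wants to eliminate the effect of different scales on the observed variables, one can resort to the standardized data matrix

$$
Z = \begin{bmatrix}
z_{11} & \cdots & z_{1k} & \cdots & z_{1p} \\
\vdots &        & \vdots &        & \vdots \\
z_{i1} & \cdots & z_{ik} & \cdots & z_{ip} \\
\vdots &        & \vdots &        & \vdots \\
z_{n1} & \cdots & z_{nk} & \cdots & z_{np}
\end{bmatrix}
$$

where

$$
z_{ik} = \frac{\tilde{x}_{ik}}{\sqrt{\mathrm{Var}(X_k)}} = \frac{x_{ik} - \bar{x}_k}{\sqrt{\mathrm{Var}(X_k)}}.
$$

If we denote by $D$ the $p \times p$ diagonal matrix having the variances of the observed variables on the main diagonal, the standardized data matrix can be defined as $Z = \tilde{X} D^{-1/2}$. Each column of $Z$ has zero mean and unit variance.

5 The covariance matrix

The covariance between two variables $X_k$ and $X_h$ is defined as:

$$
\mathrm{Cov}(X_k, X_h) = \frac{1}{n} \sum_{i=1}^{n} (x_{ik} - \bar{x}_k)(x_{ih} - \bar{x}_h) = \frac{1}{n} \sum_{i=1}^{n} \tilde{x}_{ik}\, \tilde{x}_{ih} = \frac{1}{n}\, \tilde{x}_k^\top \tilde{x}_h
$$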
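where $\tilde{x}_k$ and $\tilde{x}_h$ are the $k$-th and the $h$-th columns of $\tilde{X}$ respectively. It is worth remembering that the covariance between a variable and itself is simply the variance of that variable. Variances and covariances can then be summarized in the so-called covariance matrix $S$:

$$
S = \frac{1}{n}\, \tilde{X}^\top \tilde{X} = \frac{1}{n} \left(X - 1_n \bar{x}^\top\right)^\top \left(X - 1_n \bar{x}^\top\right)
$$

$$
= \begin{bmatrix}
\mathrm{Var}(X_1) & \cdots & \mathrm{Cov}(X_1, X_k) & \cdots & \mathrm{Cov}(X_1, X_p) \\
\vdots &        & \vdots &        & \vdots \\
\mathrm{Cov}(X_k, X_1) & \cdots & \mathrm{Var}(X_k) & \cdots & \mathrm{Cov}(X_k, X_p) \\
\vdots &        & \vdots &        & \vdots \\
\mathrm{Cov}(X_p, X_1) & \cdots & \mathrm{Cov}(X_p, X_k) & \cdots & \mathrm{Var}(X_p)
\end{bmatrix}
= \begin{bmatrix}
s_{11} & \cdots & s_{1k} & \cdots & s_{1p} \\
\vdots &        & \vdots &        & \vdots \\
s_{k1} & \cdots & s_{kk} & \cdots & s_{kp} \\
\vdots &        & \vdots &        & \vdots \\
s_{p1} & \cdots & s_{pk} & \cdots & s_{pp}
\end{bmatrix}
$$

where the diagonal elements are the variances and the off-diagonal elements are the covariances. The covariance matrix has many relevant properties:

- it is square ($p \times p$);
- it is symmetric;
- it is positive semi definite;
- its trace is the so-called total variance, $\mathrm{tr}(S) = \sum_{k=1}^{p} \mathrm{Var}(X_k)$.

To get an intuition of why the covariance matrix is positive semi definite, consider the simple case where only two variables have been observed. Because of the symmetry property, their covariance matrix is

$$
S = \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix}
$$

Since its diagonal elements are variances and hence non-negative, this matrix is positive semi definite if its determinant is greater than or equal to 0:

$$
\det S = s_{11} s_{22} - s_{12}^2 \geq 0.
$$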
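After dividing both sides of the inequality by $s_{11} s_{22}$ (assuming both variances are strictly positive) we obtain

$$
1 - \frac{s_{12}^2}{s_{11} s_{22}} \geq 0, \qquad \text{i.e.} \qquad 1 \geq \frac{s_{12}^2}{s_{11} s_{22}}.
$$

This inequality is always true, as $\frac{s_{12}^2}{s_{11} s_{22}} = r_{12}^2$ is the squared correlation coefficient between $X_1$ and $X_2$ which, by definition, can only take values between 0 and 1, both included. If $r_{12}^2$ is equal to 1, $S$ is positive semi definite; for all values of $r_{12}^2$ other than 1, $S$ is positive definite.

6 The correlation matrix

The covariance between two standardized variables $Z_k$ and $Z_h$ is defined as:

$$
\mathrm{Cov}(Z_k, Z_h) = \frac{1}{n} \sum_{i=1}^{n} (z_{ik} - \bar{z}_k)(z_{ih} - \bar{z}_h) = \frac{1}{n} \sum_{i=1}^{n} z_{ik}\, z_{ih} = \frac{1}{n}\, z_k^\top z_h
$$

because of the zero mean property of standardized variables. If we replace $z_{ik}$ and $z_{ih}$ by their expressions as functions of the observed variables (see Section 4) we obtain

$$
\mathrm{Cov}(Z_k, Z_h) = \frac{1}{n} \sum_{i=1}^{n} z_{ik}\, z_{ih} = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_{ik} - \bar{x}_k)(x_{ih} - \bar{x}_h)}{\sqrt{\mathrm{Var}(X_k)\, \mathrm{Var}(X_h)}} = \frac{\mathrm{Cov}(X_k, X_h)}{\sqrt{\mathrm{Var}(X_k)\, \mathrm{Var}(X_h)}} = r_{kh}.
$$

This means that the covariance between two standardized variables coincides with their correlation. The correlation of a variable with itself is equal to 1, as is the variance of a standardized variable. In matrix form we have

$$
R = \frac{1}{n}\, Z^\top Z = \frac{1}{n}\, D^{-1/2} \tilde{X}^\top \tilde{X} D^{-1/2} = D^{-1/2} S D^{-1/2}
= \begin{bmatrix}
1 & \cdots & r_{1k} & \cdots & r_{1p} \\
\vdots &        & \vdots &        & \vdots \\
r_{k1} & \cdots & 1 & \cdots & r_{kp} \\
\vdots &        & \vdots &        & \vdots \\
r_{p1} & \cdots & r_{pk} & \cdots & 1
\end{bmatrix}
$$

$R$ is the correlation matrix; it has many relevant properties:

- it is square ($p \times p$);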
- it is symmetric;
- it is positive semi definite (as it is the covariance matrix of the standardized variables);
- all its diagonal elements are equal to 1; therefore its trace is $\mathrm{tr}(R) = p$.

7 Multivariate random variables and derived linear combinations

The data matrix $X$ may be thought of as describing a sequence of $n$ empirical realizations of a $p$-dimensional random vector $x$. In the following we will first describe multivariate statistical methods in terms of random vectors (i.e. at the population level) and then derive their sample counterparts.

Let us assume we are dealing with a $p$-dimensional random vector $x$; we will denote by $\mu$ its $p$-dimensional expectation and by $\Sigma$ its $p \times p$ covariance matrix. It is worth remembering that, for mean centered random variables, $\Sigma = E(x x^\top)$.

Most of the multivariate statistical methods we will deal with in the following are based on linear combinations of the components of a random vector. We will denote such a linear combination by $y = a^\top x$, where $a$ is the $p$-dimensional vector of coefficients. Note that $y$ is a scalar random variable. If the interest is in more than one linear combination, say $m$, the vectors of coefficients will be the columns of a $p \times m$ matrix $A$; the $m$ linear combinations will be the components of the $m$-dimensional random vector $y$: $y = A^\top x$.

Linear combinations defined by an orthogonal matrix $A$ describe an axes rotation in the multidimensional space (a proper rotation when $\det A = 1$). Again the simpler two-variable case may help in understanding why. Figure 1 presents the coordinates of a point $P$, both in the original reference system $X_1, X_2$ and in the new reference system $Y_1, Y_2$ obtained after rotating the system $X_1, X_2$ by an angle $\alpha$. The coordinates of the point $P$ in the original reference system are $(x_1, x_2)$. The coordinates of the same point in the rotated reference system, $(y_1, y_2)$,
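can be obtained from the original ones as

$$
y_1 = x_1 \cos\alpha + x_2 \sin\alpha
$$
$$
y_2 = -x_1 \sin\alpha + x_2 \cos\alpha
$$

or, in a notation consistent with the one used before, as

$$
y_1 = a_1^\top x, \qquad y_2 = a_2^\top x
$$

where $a_1 = (\cos\alpha, \sin\alpha)^\top$ and $a_2 = (-\sin\alpha, \cos\alpha)^\top$; because of the identity $\sin^2\alpha + \cos^2\alpha = 1$, both $a_1$ and $a_2$ are unit norm vectors.

Figure 1: Axes rotation (the point $P$ in the original axes $X_1, X_2$ and in the rotated axes $Y_1, Y_2$).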
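The rotation formulas can be checked numerically with a short sketch. It assumes the usual counter-clockwise convention for the angle, so that the coefficient vectors are (cos α, sin α) and (−sin α, cos α); the function name `rotate_axes`, the point (3, 4) and the 30 degree angle are hypothetical choices made only for this illustration.

```python
import math


def rotate_axes(x1, x2, alpha):
    """Coordinates of the point (x1, x2) in axes rotated by alpha radians.

    Assumes a counter-clockwise rotation of the reference system.
    """
    a1 = (math.cos(alpha), math.sin(alpha))    # coefficients of y1
    a2 = (-math.sin(alpha), math.cos(alpha))   # coefficients of y2
    y1 = a1[0] * x1 + a1[1] * x2
    y2 = a2[0] * x1 + a2[1] * x2
    return (y1, y2), a1, a2


(y1, y2), a1, a2 = rotate_axes(3.0, 4.0, math.radians(30))

norm_a1 = math.hypot(*a1)                      # length of a1
norm_a2 = math.hypot(*a2)                      # length of a2
dot_a1_a2 = a1[0] * a2[0] + a1[1] * a2[1]      # scalar product of a1 and a2

print(norm_a1, norm_a2, dot_a1_a2)
# Distance of the point from the origin, before and after the rotation.
print(math.hypot(3.0, 4.0), math.hypot(y1, y2))
```

Up to floating-point rounding, both coefficient vectors have unit norm, they are orthogonal to each other, and the distance of the point from the origin is the same in both reference systems, which is what one expects from a rotation.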
The rotated coordinates are therefore a linear combination of the original ones.

The expected value and the variance of a single linear combination will be:

$$
E(y) = E(a^\top x) = a^\top E(x) = a^\top \mu
$$

and

$$
V(y) = V(a^\top x) = a^\top V(x)\, a = a^\top \Sigma\, a.
$$

The expected value and the variance of multiple linear combinations will be:

$$
E(y) = E(A^\top x) = A^\top E(x) = A^\top \mu
$$

and

$$
V(y) = V(A^\top x) = A^\top V(x)\, A = A^\top \Sigma\, A.
$$

Exercise 1. Given a bi-dimensional random vector $x = (x_1, x_2)^\top$ with expected value $\mu = (\mu_1, \mu_2)^\top$ and covariance matrix

$$
\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
$$

consider the two linear combinations $y_1 = x_1 - x_2$ and $y_2 = x_1 + x_2$ and derive the expected value and the covariance (the solution will be provided in class).

Exercise 2. Consider three independent standardized variables $Z_1, Z_2, Z_3$. Assume you transform them as follows, obtaining three new variables $Y_1, Y_2, Y_3$:

Y 1 = Z 1
Y 2 = Y Z 2
Y 3 = 10Z 3

Derive the covariance matrix of the new $Y$ variables.
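The rules for the mean and the variance of a single linear combination are stated above at the population level. As a rough sample-level illustration, the sketch below computes a covariance matrix with divisor n, as in Section 5, and compares the directly computed mean and variance of y with the values given by the two rules. The data matrix and the coefficient vector are made up for this example.

```python
# Hypothetical 4 x 2 data matrix; values are made up for illustration.
X = [
    [2.0, 10.0],
    [4.0, 14.0],
    [6.0, 12.0],
    [8.0, 20.0],
]
a = [1.0, 2.0]  # hypothetical coefficient vector

n, p = len(X), len(X[0])
xbar = [sum(row[k] for row in X) / n for k in range(p)]

# Covariance matrix with divisor n: entry (k, h) averages the products
# of the deviations of variables k and h from their means.
S = [
    [sum((row[k] - xbar[k]) * (row[h] - xbar[h]) for row in X) / n
     for h in range(p)]
    for k in range(p)
]

# Linear combination y_i = a' x_i computed unit by unit.
y = [sum(a[k] * row[k] for k in range(p)) for row in X]
mean_y = sum(y) / n
var_y = sum((v - mean_y) ** 2 for v in y) / n

# The same quantities obtained through the rules a' xbar and a' S a.
mean_rule = sum(a[k] * xbar[k] for k in range(p))
var_rule = sum(a[k] * S[k][h] * a[h] for k in range(p) for h in range(p))

print(mean_y, mean_rule)
print(var_y, var_rule)
```

The two routes agree: the mean of y equals a'x̄ and its variance equals a'Sa, so the linear combination never needs to be computed unit by unit once the mean vector and the covariance matrix are known.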
Because of the properties of the covariance matrix, the variance of a linear combination is a positive semi definite quadratic form. It is interesting to study its properties as the vector $a$ varies. For this purpose let us consider again the simple case of two variables only. Expectation and covariance matrix are the same as in Exercise 1. After performing the scalar product we obtain:

$$
V(y) = a^\top \Sigma\, a = \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = a_1^2 \sigma_{11} + 2 a_1 a_2 \sigma_{12} + a_2^2 \sigma_{22}
$$

If we read this variance as a function of $a_1$ we easily recognize, in the polynomial of degree 2, the equation of a parabola. The coefficient of $a_1^2$ is positive since it is a variance; the equation therefore describes a concave up parabola. The same happens if we read the variance as a function of $a_2$. This means that, as $a$ varies, $V(y)$ never reaches a finite maximum, but only a minimum.
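The behaviour of the parabola can be made concrete with a small sketch. The covariance values below are hypothetical, and the location of the vertex, a_1 = −a_2 σ_12 / σ_11, is obtained here by setting the derivative of V(y) with respect to a_1 to zero; that step is an addition for illustration and is not spelled out in the notes.

```python
# Hypothetical 2 x 2 covariance matrix entries (symmetric, positive definite).
s11, s12, s22 = 5.0, 7.0, 14.0


def variance_of_combination(a1, a2):
    """V(y) = a1^2 * s11 + 2 * a1 * a2 * s12 + a2^2 * s22."""
    return a1 * a1 * s11 + 2.0 * a1 * a2 * s12 + a2 * a2 * s22


a2 = 1.0
# Vertex of the parabola in a1: solve 2 * a1 * s11 + 2 * a2 * s12 = 0.
a1_min = -a2 * s12 / s11
v_min = variance_of_combination(a1_min, a2)

# Evaluate V(y) at the vertex and at points increasingly far from it.
grid = [a1_min + step for step in (-100.0, -1.0, 0.0, 1.0, 100.0)]
values = [variance_of_combination(a1, a2) for a1 in grid]

print(a1_min, v_min)
print(values)
```

With a_2 held fixed, the smallest value is attained at the vertex, and the variance keeps growing as a_1 moves away from it in either direction, so no finite maximum exists unless the coefficients are constrained.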
More informationLecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices
Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is
More informationKey Algebraic Results in Linear Regression
Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 51 Outline 1 Matrix Expression 2 Linear and quadratic forms 3 Properties of quadratic form 4 Properties of estimates 5 Distributional properties 3 / 51 Matrix
More informationGaussian random variables inr n
Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b
More informationThe degree of a function is the highest exponent in the expression
L1 1.1 Power Functions Lesson MHF4U Jensen Things to Remember About Functions A relation is a function if for every x-value there is only 1 corresponding y-value. The graph of a relation represents a function
More informationVector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.
Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar
More informationLecture Note 1: Probability Theory and Statistics
Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would
More informationExam 2. Jeremy Morris. March 23, 2006
Exam Jeremy Morris March 3, 006 4. Consider a bivariate normal population with µ 0, µ, σ, σ and ρ.5. a Write out the bivariate normal density. The multivariate normal density is defined by the following
More information1 Principal component analysis and dimensional reduction
Linear Algebra Working Group :: Day 3 Note: All vector spaces will be finite-dimensional vector spaces over the field R. 1 Principal component analysis and dimensional reduction Definition 1.1. Given an
More informationCS 246 Review of Linear Algebra 01/17/19
1 Linear algebra In this section we will discuss vectors and matrices. We denote the (i, j)th entry of a matrix A as A ij, and the ith entry of a vector as v i. 1.1 Vectors and vector operations A vector
More informationMath 313 Chapter 1 Review
Math 313 Chapter 1 Review Howard Anton, 9th Edition May 2010 Do NOT write on me! Contents 1 1.1 Introduction to Systems of Linear Equations 2 2 1.2 Gaussian Elimination 3 3 1.3 Matrices and Matrix Operations
More informationCommon-Knowledge / Cheat Sheet
CSE 521: Design and Analysis of Algorithms I Fall 2018 Common-Knowledge / Cheat Sheet 1 Randomized Algorithm Expectation: For a random variable X with domain, the discrete set S, E [X] = s S P [X = s]
More informationSection 7.3: SYMMETRIC MATRICES AND ORTHOGONAL DIAGONALIZATION
Section 7.3: SYMMETRIC MATRICES AND ORTHOGONAL DIAGONALIZATION When you are done with your homework you should be able to Recognize, and apply properties of, symmetric matrices Recognize, and apply properties
More informationSelection on Multiple Traits
Selection on Multiple Traits Bruce Walsh lecture notes Uppsala EQG 2012 course version 7 Feb 2012 Detailed reading: Chapter 30 Genetic vs. Phenotypic correlations Within an individual, trait values can
More information