Principal Dynamical Components
1 Principal Dynamical Components Manuel D. de la Iglesia* Departamento de Análisis Matemático, Universidad de Sevilla Instituto Nacional de Matemática Pura e Aplicada (IMPA) Rio de Janeiro, May 14, 2013 *Joint work with Esteban G. Tabak, Courant Institute
2 OUTLINE 1. Principal component analysis (PCA) and autoregressive models (AR(p)) 2. Principal dynamical components (PDC) 3. Application to the Global Sea-Surface Temperature Field
6 PRINCIPAL COMPONENT ANALYSIS (PCA) Suppose that z ∈ R^n is a vector of random variables, and that the variances of the n random variables and the structure of the covariances between them are of interest. Given N independent observations z_1, z_2, ..., z_N ∈ R^n, the main goal of PCA is to reduce the dimensionality of the data set while retaining as much as possible of the variation present in it. This is achieved by transforming to a new set of m variables (m < n), the principal components (PCs), which are uncorrelated, and which are ordered so that the first few retain most of the variation present in all of the original variables.
10 A COUPLE OF EXAMPLES One analysis considered the grades of N = 15 students in n = 8 subjects. The first two PCs account for 82.1% of the total variation in the data set. The first was strongly correlated with humanities subjects and the second with science subjects. The analysis of n = 11 socio-economic indicators of N = 96 countries revealed that almost all the information could be explained by only two PCs. The first was related to the gross domestic product of the country, while the second reflected rural conditions.
13 HOW TO OBTAIN THE PCs Singular value decomposition. Given a data set z_1, z_2, ..., z_N ∈ R^n (with the mean value already subtracted), the first m PCs are given by x_j = Q_x' z_j, where Q_x ∈ R^{n×m} has orthonormal columns such that the predictive uncertainty

Σ_{j=1}^N ||z_j − Q_x x_j||²

is minimal. The matrix Q_x consists of the first m columns of U in the singular value decomposition Z' = U S V', where U ∈ R^{n×n} and V ∈ R^{N×N} are orthogonal matrices and S ∈ R^{n×N} is diagonal, with the singular values (the square roots of the eigenvalues of the covariance matrix Z'Z) sorted in decreasing order (Z ∈ R^{N×n} stacks the observations z_j' as its rows).
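The construction above can be sketched numerically (an illustrative snippet, not from the slides; the helper name pca_svd and the synthetic data are ours):

```python
import numpy as np

def pca_svd(Z, m):
    """First m principal components of a data matrix Z (N samples x n variables)."""
    Zc = Z - Z.mean(axis=0)                    # subtract the mean value
    # SVD of Z': the columns of U are the principal directions
    U, s, Vt = np.linalg.svd(Zc.T, full_matrices=False)
    Qx = U[:, :m]                              # n x m with orthonormal columns
    X = Zc @ Qx                                # PCs: x_j = Qx' z_j
    return Qx, X

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
Qx, X = pca_svd(Z, 2)
# Qx'Qx = I, the PCs are uncorrelated, and their variances are sorted
```

The uncorrelatedness and the variance ordering of the PCs follow directly from the orthogonality of U and V and the sorting of the singular values.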
15 AUTOREGRESSIVE MODELS (AR(p)) One-dimensional AR(p). For a random process z the AR(p) model is defined as

z_j = b + Σ_{i=1}^p a_i z_{j−i} + ε_j,

where a_1, ..., a_p are the parameters of the model, b is a constant, and ε_j is white noise. The process is stationary if the roots of the polynomial x^p − Σ_{i=1}^p a_i x^{p−i} lie within the unit circle. The parameters a_1, ..., a_p can be estimated by solving the Yule-Walker equations

γ_m = Σ_{i=1}^p a_i γ_{m−i}, m = 1, ..., p,

where γ_m = E[z_{j+m} z_j] are the covariance functions (γ_{−m} = γ_m). In matrix form, (γ_1, ..., γ_p)' = Γ (a_1, ..., a_p)', with Γ_{kl} = γ_{k−l}.
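The estimation step can be sketched in a few lines (illustrative code, not from the slides): the sample autocovariances give a linear system for a_1, ..., a_p.

```python
import numpy as np

def yule_walker(z, p):
    """Estimate AR(p) coefficients by solving the Yule-Walker system
    gamma_m = sum_i a_i gamma_{m-i}, m = 1..p."""
    z = np.asarray(z) - np.mean(z)
    N = len(z)
    # sample autocovariances gamma_0 .. gamma_p
    gamma = np.array([z[: N - m] @ z[m:] / N for m in range(p + 1)])
    # Toeplitz system: R a = r with R[i, j] = gamma[|i - j|], r[m] = gamma[m + 1]
    R = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, gamma[1 : p + 1])

# simulate a stationary AR(2): z_j = 0.5 z_{j-1} - 0.3 z_{j-2} + eps_j
rng = np.random.default_rng(1)
z = np.zeros(20000)
for j in range(2, len(z)):
    z[j] = 0.5 * z[j - 1] - 0.3 * z[j - 2] + rng.normal()
a_hat = yule_walker(z, 2)
```

With this sample size the estimated coefficients land close to (0.5, −0.3).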
17 SOME EXAMPLES OF AR(1) AND AR(2)
18 VECTOR AR(p) For a vector-valued random process z ∈ R^n the AR(p) model is defined as

z_j = b + Σ_{i=1}^p A_i z_{j−i} + ε_j,

where A_1, ..., A_p ∈ R^{n×n} are the parameters of the model, b is a constant vector, and ε_j is a vector of white noise. Now the process is stationary if the roots of

det( x^p I_n − Σ_{i=1}^p A_i x^{p−i} ) = 0

lie within the unit circle. Similarly, an estimate of the parameters A_1, ..., A_p can be obtained by solving the Yule-Walker equations (now with block matrices).
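The stationarity condition can be checked numerically through the block companion matrix, whose eigenvalues are exactly the roots of the determinantal equation above (a sketch with our own helper name, not from the slides):

```python
import numpy as np

def var_is_stationary(A_list):
    """Check stationarity of a VAR(p): build the block companion matrix and
    verify that all its eigenvalues lie strictly inside the unit circle."""
    p = len(A_list)
    n = A_list[0].shape[0]
    C = np.zeros((n * p, n * p))
    C[:n, :] = np.hstack(A_list)                 # top block row: A_1 ... A_p
    C[n:, : n * (p - 1)] = np.eye(n * (p - 1))   # shifted identity blocks below
    return bool(np.all(np.abs(np.linalg.eigvals(C)) < 1))

A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
stationary = var_is_stationary([A1, A2])
```

A unit-root example such as A_1 = I is correctly flagged as non-stationary.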
21 PRINCIPAL DYNAMICAL COMPONENTS (PDC) Goal: mix PCA and AR(p). Given a time series z_j ∈ R^n with transition probability T(z_{j+1} | z_j) (Markovian), PDC considers a dimensional reduction of T(z_{j+1} | z_j) of the form

T(z_{j+1} | z_j) = J(z_{j+1}) e(y_{j+1} | x_{j+1}) d(x_{j+1} | x_j),

where x = P_x(z) ∈ R^m and y = P_y(z) ∈ R^{n−m} (P_x and P_y are projection operators) and J(z) is the Jacobian determinant of the coordinate map z → (x, y). Here e(y | x) is a probabilistic embedding and d(x_{j+1} | x_j) is a reduced dynamical model.
23 PRINCIPAL DYNAMICAL COMPONENTS (PDC) We will focus on the case where P_x and P_y are orthogonal projections (therefore J(z) = 1) and the embedding and reduced dynamics are given by isotropic Gaussians:

e(y | x) = N(0, σ² I_{n−m}), d(x_{j+1} | x_j) = N(A x_j, σ² I_m).

Therefore, the log-likelihood function is given by

L = − Σ_{j=1}^{N−1} [ (n/2) log(2π) + n log(σ) + ( ||x_{j+1} − A x_j||² + ||y_{j+1}||² ) / (2σ²) ].

Maximizing L over P and A is equivalent to minimizing the cost function

c = 1/(N−1) Σ_{j=1}^{N−1} ( ||x_{j+1} − A x_j||² + ||y_{j+1}||² ).
25 LINEAR AND AUTONOMOUS CASE (MARKOVIAN) For a time series z ∈ R^n we look for an m-dimensional submanifold x = Q_x' z such that

z = [Q_x Q_y] (x; y)

(stacking x above y). At the same time we look for dynamics in the submanifold of the form x_{j+1} = A x_j. The minimization problem that defines Q = [Q_x Q_y] and A is

min_{Q,A} c = Σ_{j=1}^{N−1} || z_{j+1} − Q (A Q_x' z_j; 0) ||² = Σ_{j=1}^{N−1} || (x_{j+1}; y_{j+1}) − (A x_j; 0) ||².
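For a fixed projection Q, the inner minimization over A is an ordinary least-squares problem, which is the building block of any descent or alternating scheme over (Q, A). A minimal sketch (our own helper; the slides descend jointly over Q and A):

```python
import numpy as np

def best_dynamics(X):
    """Least-squares A minimizing sum_j ||x_{j+1} - A x_j||^2, given the
    reduced trajectory X whose rows are x_1', x_2', ... for a fixed Q_x."""
    X0, X1 = X[:-1], X[1:]                     # pairs (x_j, x_{j+1})
    # rows satisfy x_{j+1}' = x_j' A', so lstsq(X0, X1) returns A'
    return np.linalg.lstsq(X0, X1, rcond=None)[0].T

# a noiseless linear trajectory is recovered exactly
A_true = np.array([[0.9, 0.2], [-0.1, 0.7]])
X = np.zeros((30, 2))
X[0] = [1.0, -0.5]
for j in range(29):
    X[j + 1] = A_true @ X[j]
A_hat = best_dynamics(X)
```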
28 PDC SIMPLEST CASE n = 2 Let us call z = (A, P)'. We look for a one-dimensional submanifold x of z,

x = A cos(θ) + P sin(θ), y = −A sin(θ) + P cos(θ).

Additionally we look for a dynamic in x, x_{j+1} = a x_j. The cost function is then

c(θ, a) = Σ_{j=1}^{N−1} || (A_{j+1}, P_{j+1})' − (Ã_{j+1}, P̃_{j+1})' ||² = Σ_{j=1}^{N−1} || (x_{j+1}, y_{j+1})' − (x̃_{j+1}, ỹ_{j+1})' ||² = Σ_{j=1}^{N−1} [ (y_{j+1})² + (x_{j+1} − a x_j)² ],

where (x̃_{j+1}, ỹ_{j+1}) = (a x_j, 0) is the prediction. By contrast, the corresponding cost function for regular principal components in this 2-dimensional scenario is

c_PCA(θ) = Σ_{j=1}^N y_j².
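Both cost functions can be written down directly (a small sketch with our own function names, using the slide's rotation convention):

```python
import numpy as np

def pdc_cost(theta, a, A, P):
    """c(theta, a): rotate the observed (A, P) into (x, y), then charge the
    unexplained y_{j+1}^2 plus the dynamical misfit (x_{j+1} - a x_j)^2."""
    x = A * np.cos(theta) + P * np.sin(theta)
    y = -A * np.sin(theta) + P * np.cos(theta)
    return np.sum(y[1:] ** 2 + (x[1:] - a * x[:-1]) ** 2)

def pca_cost(theta, A, P):
    """c_PCA(theta): the static counterpart, charging only y_j^2."""
    y = -A * np.sin(theta) + P * np.cos(theta)
    return np.sum(y ** 2)

# sanity check: noiseless data generated exactly by x_{j+1} = a0 x_j, y = 0,
# then rotated by th0, has (near-)zero PDC cost at the true parameters
a0, th0 = 0.6, np.pi / 3
x = 2.0 * a0 ** np.arange(40)
Aj, Pj = x * np.cos(th0), x * np.sin(th0)
```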
32 SYNTHETIC EXAMPLE We created data from the dynamical model

x_{j+1} = a x_j + 0.3 ε_j^x, y_{j+1} = 0.6 ε_j^y,

where a = 0.6, j = 1, ..., 999, and ε_j^x, ε_j^y are independent samples from a normal distribution. Then we rotated the data through the angle θ = π/3,

A_j = x_j cos(θ) − y_j sin(θ), P_j = x_j sin(θ) + y_j cos(θ),

and we performed descent over the variables a and θ.
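The whole experiment fits in a few lines (an illustrative re-creation with our own seed, a profile search over θ in place of the slides' descent, and more samples than the slide's 999 for a stable estimate):

```python
import numpy as np

rng = np.random.default_rng(2)
a_true, th_true, N = 0.6, np.pi / 3, 5000
x = np.zeros(N); y = np.zeros(N)
for j in range(N - 1):                  # the slide's model, eps ~ N(0, 1)
    x[j + 1] = a_true * x[j] + 0.3 * rng.normal()
    y[j + 1] = 0.6 * rng.normal()
Aj = x * np.cos(th_true) - y * np.sin(th_true)   # rotated observations
Pj = x * np.sin(th_true) + y * np.cos(th_true)

def cost(theta):
    """Profile cost: for each theta the optimal a is a 1-d least squares."""
    xr = Aj * np.cos(theta) + Pj * np.sin(theta)
    yr = -Aj * np.sin(theta) + Pj * np.cos(theta)
    a = (xr[1:] @ xr[:-1]) / (xr[:-1] @ xr[:-1])
    return np.sum(yr[1:] ** 2 + (xr[1:] - a * xr[:-1]) ** 2), a

thetas = np.linspace(0, np.pi, 2001)
costs = [cost(t)[0] for t in thetas]
th_hat = thetas[int(np.argmin(costs))]
a_hat = cost(th_hat)[1]
```

The minimizer recovers the planted angle and dynamics, θ̂ ≈ π/3 and â ≈ 0.6 (up to the θ ↔ θ + π symmetry of the rotation).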
34 SYNTHETIC EXAMPLE [Figure: data in the (A, P) plane, and the convergence of a, θ, and the cost c as functions of the descent step]
35 NONAUTONOMOUS PROBLEMS Take n = 2, m = 1. We created data from the dynamical model

x_{j+1} = a_j x_j + b_j + 0.3 ε_j^x, y_{j+1} = ȳ_j + ε_j^y,

for j = 1, ..., 999, and we adopted the values a_j = (6/5) cos²(2π t_j / T) for the dynamics, b_j = (1/2) sin(2π t_j / T) for the drift, and ȳ_j = (2/5) cos(2π t_j / T) for the non-zero mean of y, where t_j = j and T = 12. Then we rotated the data through

A_j = x_j cos(θ_j) − y_j sin(θ_j), P_j = x_j sin(θ_j) + y_j cos(θ_j),

where θ_j = (π/6) sin(2π t_j / T).
38 NONAUTONOMOUS PROBLEMS [Figure: data in the (A, P) plane; the recovered a(t), b(t), θ(t), and ȳ(t) against t_j; and the cost c against the descent step]
39 HIGHER-ORDER PROCESSES Take n = 2, m = 1, and r = 3, the order of the non-Markovian process. We created data from the dynamical model

x_{j+1} = a_1 x_j + a_2 x_{j−1} + a_3 x_{j−2} + 0.3 ε_j^x, y_{j+1} = 0.6 ε_j^y,

for j = 3, ..., 999, and we adopted the values a_1 = 0.4979, a_2 = …, a_3 = … for the dynamics. Then, as before, we define

A_j = x_j cos(θ) − y_j sin(θ), P_j = x_j sin(θ) + y_j cos(θ),

with θ = π/3, and provide the A_j and P_j as data for the principal dynamical component routine.
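Fitting order-3 dynamics from such data is again a least-squares regression on lagged values (a sketch: a_1 = 0.4979 is from the slide, while the a_2 and a_3 values below are hypothetical stand-ins, since the slide's values did not survive transcription):

```python
import numpy as np

# a_1 from the slide; a_2 and a_3 are hypothetical illustrative values
a = np.array([0.4979, 0.25, -0.1])

rng = np.random.default_rng(3)
x = np.zeros(5000)
for j in range(3, len(x)):
    x[j] = a[0] * x[j - 1] + a[1] * x[j - 2] + a[2] * x[j - 3] + 0.3 * rng.normal()

# regress x_{j+1} on (x_j, x_{j-1}, x_{j-2}) to recover the coefficients
X = np.column_stack([x[2:-1], x[1:-2], x[:-3]])
a_hat = np.linalg.lstsq(X, x[3:], rcond=None)[0]
```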
42 HIGHER-ORDER PROCESSES [Figure: data in the (A, P) plane, and the convergence of a_1, a_2, a_3, θ, and the cost c with the descent step]
43 APPLICATION TO THE GLOBAL SEA-SURFACE TEMPERATURE FIELD Preliminary considerations. Database: monthly averaged, extended reconstructed global sea-surface temperatures based on COADS data (January 1854 to October 2009). The ocean is not an isolated player in climate dynamics: it interacts with the atmosphere and the continents, and is also affected by external conditions, such as solar radiation or human-related release of CO2 into the atmosphere. The latter are examples of slowly varying external trends that fit naturally into our non-autonomous setting. Even within the ocean, the surface temperature does not evolve alone: it is carried by currents, and it interacts through mixing with lower layers of the ocean. One way to account for unobserved variables is to make the model non-Markovian.
47 50 POINTS IN THE OCEAN [Figure: world map marking the 50 selected points in the ocean] Hence n = 50 and N = 1858.
48 CHOICE OF DIMENSION AND ORDER Order of the process: r = 3
51 CHOICE OF DIMENSION AND ORDER [Figure: predictive uncertainty (final error) as a function of the order r of the non-Markovian process and of the dimension m of the reduced dynamical manifold] Therefore, we pick r = 3 and m = 4.
53 REAL AND PREDICTED DYNAMICAL COMPONENTS [Figure: the first, second, third, and fourth dynamical components x plotted against their predictions x_pred over time t]
54 POINTS OF SIGNIFICANCE: 19, 24, 37, 41 [Figure: significance of the first, second, third, and fourth components at each of the 50 points in the ocean]
55 50 POINTS IN THE OCEAN [Figure: world map of the 50 ocean locations]
57 OBSERVED AND PREDICTED TEMPERATURES [Figure: observed and predicted temperatures over time t]
58 ANOMALIES We consider the three-month mean SST anomaly in the following El Niño region: [Figure: world map indicating the El Niño region]
59 ANOMALIES [Figure: the three-month mean SST anomalies over time t]
60 CONCLUSIONS This new methodology allows a dimensional reduction of time series, seeking a low-dimensional manifold x and a dynamical model x_{j+1} = D(x_j, x_{j−1}, ..., t) that minimize the predictive uncertainty of the series. We have tested it on synthetic data and on a real application to sea-surface temperature over the ocean. It would be interesting to apply this new methodology to other types of data, such as economic or social data. An interesting question is future prediction. For sea-surface temperatures, the prediction degrades after about 3 months, because many unresolved variables can modify the model significantly. Nevertheless, the prediction could work well for other systems where the data are less volatile. M. D. de la Iglesia and E. G. Tabak, Principal dynamical components, Comm. Pure Appl. Math. 66 (2013), no. 1, 48–82.
65 THANK YOU
More informationLinear Factor Models. Sargur N. Srihari
Linear Factor Models Sargur N. srihari@cedar.buffalo.edu 1 Topics in Linear Factor Models Linear factor model definition 1. Probabilistic PCA and Factor Analysis 2. Independent Component Analysis (ICA)
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationLegendre s Equation. PHYS Southern Illinois University. October 18, 2016
Legendre s Equation PHYS 500 - Southern Illinois University October 18, 2016 PHYS 500 - Southern Illinois University Legendre s Equation October 18, 2016 1 / 11 Legendre s Equation Recall We are trying
More information1 Kalman Filter Introduction
1 Kalman Filter Introduction You should first read Chapter 1 of Stochastic models, estimation, and control: Volume 1 by Peter S. Maybec (available here). 1.1 Explanation of Equations (1-3) and (1-4) Equation
More informationUnsupervised Learning
2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and
More informationUnsupervised Learning: Dimensionality Reduction
Unsupervised Learning: Dimensionality Reduction CMPSCI 689 Fall 2015 Sridhar Mahadevan Lecture 3 Outline In this lecture, we set about to solve the problem posed in the previous lecture Given a dataset,
More informationFunctional time series
Rob J Hyndman Functional time series with applications in demography 4. Connections, extensions and applications Outline 1 Yield curves 2 Electricity prices 3 Dynamic updating with partially observed functions
More informationECE534, Spring 2018: Solutions for Problem Set #5
ECE534, Spring 08: s for Problem Set #5 Mean Value and Autocorrelation Functions Consider a random process X(t) such that (i) X(t) ± (ii) The number of zero crossings, N(t), in the interval (0, t) is described
More informationWhat is Principal Component Analysis?
What is Principal Component Analysis? Principal component analysis (PCA) Reduce the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables Retains most
More informationVAR Model. (k-variate) VAR(p) model (in the Reduced Form): Y t-2. Y t-1 = A + B 1. Y t + B 2. Y t-p. + ε t. + + B p. where:
VAR Model (k-variate VAR(p model (in the Reduced Form: where: Y t = A + B 1 Y t-1 + B 2 Y t-2 + + B p Y t-p + ε t Y t = (y 1t, y 2t,, y kt : a (k x 1 vector of time series variables A: a (k x 1 vector
More informationLECTURE 16: PCA AND SVD
Instructor: Sael Lee CS549 Computational Biology LECTURE 16: PCA AND SVD Resource: PCA Slide by Iyad Batal Chapter 12 of PRML Shlens, J. (2003). A tutorial on principal component analysis. CONTENT Principal
More informationLecture 4: Least Squares (LS) Estimation
ME 233, UC Berkeley, Spring 2014 Xu Chen Lecture 4: Least Squares (LS) Estimation Background and general solution Solution in the Gaussian case Properties Example Big picture general least squares estimation:
More informationMLCC 2015 Dimensionality Reduction and PCA
MLCC 2015 Dimensionality Reduction and PCA Lorenzo Rosasco UNIGE-MIT-IIT June 25, 2015 Outline PCA & Reconstruction PCA and Maximum Variance PCA and Associated Eigenproblem Beyond the First Principal Component
More informationIntroduction PCA classic Generative models Beyond and summary. PCA, ICA and beyond
PCA, ICA and beyond Summer School on Manifold Learning in Image and Signal Analysis, August 17-21, 2009, Hven Technical University of Denmark (DTU) & University of Copenhagen (KU) August 18, 2009 Motivation
More informationNumerical Methods I Singular Value Decomposition
Numerical Methods I Singular Value Decomposition Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 9th, 2014 A. Donev (Courant Institute)
More informationLecture 5 Singular value decomposition
Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn
More informationVectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =
Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.
More informationWrite your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
2016 Booklet No. Test Code : PSA Forenoon Questions : 30 Time : 2 hours Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
More informationDimension Reduction and Low-dimensional Embedding
Dimension Reduction and Low-dimensional Embedding Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1/26 Dimension
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More informationPrincipal component analysis
Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 The Singular Value Decomposition (SVD) continued Linear Algebra Methods for Data Mining, Spring 2007, University
More informationIntroduction to Probability and Stocastic Processes - Part I
Introduction to Probability and Stocastic Processes - Part I Lecture 2 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark
More informationState-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53
State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State
More information1. Introduction Systems evolving on widely separated time-scales represent a challenge for numerical simulations. A generic example is
COMM. MATH. SCI. Vol. 1, No. 2, pp. 385 391 c 23 International Press FAST COMMUNICATIONS NUMERICAL TECHNIQUES FOR MULTI-SCALE DYNAMICAL SYSTEMS WITH STOCHASTIC EFFECTS ERIC VANDEN-EIJNDEN Abstract. Numerical
More information5: MULTIVARATE STATIONARY PROCESSES
5: MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalarvalued random variables on the same probability
More informationMATH 829: Introduction to Data Mining and Analysis Principal component analysis
1/11 MATH 829: Introduction to Data Mining and Analysis Principal component analysis Dominique Guillot Departments of Mathematical Sciences University of Delaware April 4, 2016 Motivation 2/11 High-dimensional
More informationIntelligent Data Analysis. Principal Component Analysis. School of Computer Science University of Birmingham
Intelligent Data Analysis Principal Component Analysis Peter Tiňo School of Computer Science University of Birmingham Discovering low-dimensional spatial layout in higher dimensional spaces - 1-D/3-D example
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationSensitivity Analysis for the Problem of Matrix Joint Diagonalization
Sensitivity Analysis for the Problem of Matrix Joint Diagonalization Bijan Afsari bijan@math.umd.edu AMSC Department Sensitivity Analysis for the Problem of Matrix Joint Diagonalization p. 1/1 Outline
More informationFundamentals of Unconstrained Optimization
dalmau@cimat.mx Centro de Investigación en Matemáticas CIMAT A.C. Mexico Enero 2016 Outline Introduction 1 Introduction 2 3 4 Optimization Problem min f (x) x Ω where f (x) is a real-valued function The
More information5 Operations on Multiple Random Variables
EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y
More informationLinear Algebra, part 2 Eigenvalues, eigenvectors and least squares solutions
Linear Algebra, part 2 Eigenvalues, eigenvectors and least squares solutions Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2013 Main problem of linear algebra 2: Given
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationPredictive spatio-temporal models for spatially sparse environmental data. Umeå University
Seminar p.1/28 Predictive spatio-temporal models for spatially sparse environmental data Xavier de Luna and Marc G. Genton xavier.deluna@stat.umu.se and genton@stat.ncsu.edu http://www.stat.umu.se/egna/xdl/index.html
More informationPreliminary algebra. Polynomial equations. and three real roots altogether. Continue an investigation of its properties as follows.
978-0-51-67973- - Student Solutions Manual for Mathematical Methods for Physics and Engineering: 1 Preliminary algebra Polynomial equations 1.1 It can be shown that the polynomial g(x) =4x 3 +3x 6x 1 has
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationPrincipal Component Analysis (PCA) Theory, Practice, and Examples
Principal Component Analysis (PCA) Theory, Practice, and Examples Data Reduction summarization of data with many (p) variables by a smaller set of (k) derived (synthetic, composite) variables. p k n A
More informationData Mining Lecture 4: Covariance, EVD, PCA & SVD
Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The
More informationvariability and their application for environment protection
Main factors of climate variability and their application for environment protection problems in Siberia Vladimir i Penenko & Elena Tsvetova Institute of Computational Mathematics and Mathematical Geophysics
More information