Basics of Multivariate Modelling and Data Analysis
|
|
- Melinda Wood
- 5 years ago
- Views:
Transcription
1 Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 2. Overview of multivariate techniques 2.1 Different approaches to multivariate data analysis 2.2 Classification of multivariate techniques KEH Basics of Multivariate Modelling and Data Analysis 1
2 2. Overview of multivariate techniques 2.1 Different approaches to multivariate data analysis Two main approaches Essentially, there are two different approaches: Model-driven approach, where data is seen as realizations of random variables of an underlying statistical model. This interpretation is usually favoured by statisticians. Data-driven approach, where the statistical tools are basically seen as algorithms to obtain results. This view is typical of chemometricians and data miners. However, these are not two different ways for solving the same problem. Traditional statistical (model-driven) methods do not work well when the statistical properties are unknown (i.e. assumptions are not fulfilled) the number of variables is large compared to the number of objects (samples, observations, measurements), sometimes larger than the variables are strongly correlated Data-driven methods are developed to handle this kind of data. KEH Basics of Multivariate Modelling and Data Analysis 2
3 2. Overview of multivariate techniques 2.1 Different approaches Chemometrics Chemometrics is the science of extracting information from chemical systems (chemistry, biochemistry, chemical engineering, etc.) by data-driven means a highly cross-disciplinary activity using methods from applied mathematics, statistics, informatics and computer science applied to datasets which are often very large and highly complex, involving hundreds to tens of thousands of variables, and hundreds to millions of cases or observations applied to solve both descriptive and predictive problems descriptive application: modelling with the intent of learning the underlying relationships and structure of the system (i.e. model identification) predictive application: modelling with the intent of predicting new properties or behaviour of interest We will mainly deal with applications of chemometrics in this course. KEH Basics of Multivariate Modelling and Data Analysis 3
4 2. Overview of multivariate techniques 2.1 Different approaches Data mining Data mining is the science of extracting information from large data sets and databases an integration of techniques from statistics, mathematics, machine learning, database technology, data visualization, pattern recognition, signal processing, information retrieval, high-performance computing, often applied to huge data sets involving gigabytes (2 30 ~10 9 bytes; the Human Genome Project), terabytes (2 40 ~10 12 bytes; space and earth sciences), soon even petabytes (2 50 ~10 15 bytes) applied to solve both descriptive and predictive problems descriptive data mining (or unsupervised learning): search massive data sets and discover the locations of unexpected structures or relationships, patterns, trends, clusters and outliers predictive data mining (or supervised learning): build models and procedures for regression, classification, pattern recognition and machine learning tasks; assess the predictive accuracy of those methods when applied to new data Obviously, data mining methods are also used in chemometrics. KEH Basics of Multivariate Modelling and Data Analysis 4
5 2. Overview of multivariate techniques 2.2 Classification of multivariate techniques Data matrices Data is assumed to be available in a data matrix X, where each column x j, j = 1,, p, contains n observations (measurements). Each column in X represents a variable and each row contains measurements of all variables in a sample (or time instant). These variables are termed independent variables (although they may be highly correlated). In addition there may be data of a number of dependent variables in a matrix Y, where each column y k, k = 1,, q, contains n observations. Each column in Y represents a variable and each row contains measurements of all dependent variables in a sample (or time instant). X = Y = x 11 x 21 x n1 x 12 x 22 xn2 x1 x2 x 1p x 2 p y 11 y 21 y n1 y 12 y 22 yn2 y y1 2 xnp x y q p y 1q y 2q ynq KEH Basics of Multivariate Modelling and Data Analysis 5
6 2. Overview of multivariate techniques 2.2 Classification Main classification criterion Our main classification criterion is the number of dependent variables. The classes are no dependent variable, q = 0 one dependent variable, q = 1 many dependent variables, q > 1 Note that this classification is determined by how we choose to treat the data, not by the true (unknown) dependencies in the data set. Modelling (like regression) is termed simple, if there is only one independent variable (e.g. simple regression) multiple, if there is one dependent variable but many independent ones multivariate, if there is many dependent and many independent variables The case with one independent variable is handled by classical univariate statistics, and will not be treated here. In addition, variables may be classified according to the type of measurement: metric (quantitative, ~ continuous) nonmetric (qualitative, categorical, discrete, often binary) KEH Basics of Multivariate Modelling and Data Analysis 6
7 2. Overview of multivariate techniques 2.2 Classification No dependent variable In these methods, only the data matrix X is considered. The data may be metric or nonmetric Principal component analysis (PCA) PCA is a method that can be used to analyse interrelationships among a large number of variables explain these variables in terms of their common underlying components condense the information in a number of original variables into a smaller set of principal components with a minimal loss of information Mathematically, we want to find the weights p jl that maximize the variance of each t l in such a way that every t l is uncorrelated with every t m, m l. t = x p + x p + + x p t = x p + x p + + x p t = x p + x p + + x p p p p p2 a 1 1a 2 2a p pa Here a is the number of principal components. If cov(t l ) 0, there is no useful information in t l. Only components that contain useful information are retained. Usually a p (often a = 2 4 ). KEH Basics of Multivariate Modelling and Data Analysis 7
8 2.2 Classification of multivariate techniques No dependent variable Factor analysis (FA) Factor analysis is similar to PCA and can be used for the same purpose. However, unlike PCA, FA is based on a statistical model with certain assumptions. x t t t e FA FA FA FA FA FA = p + p + + p a a 1 FA FA FA FA FA FA x = p t + p t + + p t + e a a 2 x p t p t p t e FA FA FA FA FA FA = p p11 p22 pa a p Mathematically, we want to find the weights p FA jl and the factors t FA l so that the error variance behaves in a certain way. Example. Consumer rating. Assume customers in a fast-food restaurant are asked to rate the restaurant on the following six variables: food taste, food temperature, food freshness, waiting time, cleanliness, friendliness of employees. Analysis of the customer responses by factor analysis may show that the variables food taste, temperature and freshness combine together to form a single factor food quality, whereas waiting time, cleanliness and friendliness form a factor service quality. KEH Basics of Multivariate Modelling and Data Analysis 8
9 2.2 Classification of multivariate techniques No dependent variable Cluster analysis (FA) Cluster analysis is an analytical technique for developing meaningful subgroups of individuals or objects. The subgroups are not predefined; they are identified by the analysis. KEH Basics of Multivariate Modelling and Data Analysis 9
10 2. Overview of multivariate techniques 2.2 Classification One dependent variable In these methods, a vector of dependent variables, Y = y, is considered in addition to the data matrix X. The data my be metric or nonmetric Multiple regression analysis (MRA) MRA may be used to relate a single metric dependent variable to a number of independent variables. predict changes in the dependent variable in response to changes in the independent variables. Mathematically, we want to find the y= b0+ b1x1+ b2x2+ + bpxp + e parameters b j, j = 0,, p, that maximize the correlation between yˆ = b0+ b1x1+ b2x2+ + bpxp y and the prediction y. This is equivalent to minimizing the variance of the error e or the sum of the squared residuals. Note: This method does not work well if the independent variables are (strongly) correlated. KEH Basics of Multivariate Modelling and Data Analysis 10
11 2.2 Classification of multivariate techniques One dependent variable Multiple discriminant analysis (MDA) MDA is the appropriate multivariable technique if the single dependent variable y is nonmetric, either dichotomous (e.g., male female), or multichotomous (e.g., high medium low). The independent variables x j are assumed to be metric. Thus, discriminant analysis is applicable when the total sample can be divided into groups based on a nonmetric dependent variable characterizing several known classes. The primary objectives of MDA are to understand groups differences predict the likelihood that an entity (individual or object) will belong to a particular class or group based on several metric independent variables KEH Basics of Multivariate Modelling and Data Analysis 11
12 2.2 Classification of multivariate techniques One dependent variable Logistic regression Logistic regression models, often referred to as logit analysis, are a combination of multiple regression analysis (MRA) many independent variables multiple discriminant analysis (MDA) nonmetric dependent variable The difference with these methods is that the independent variables may be may be metric or nonmetric do not require the assumption of multivariate normality In many cases, particularly with more than two levels of the dependent variable, MDA is the more appropriate technique. KEH Basics of Multivariate Modelling and Data Analysis 12
13 2. Overview of multivariate techniques 2.2 Classification Many dependent variables In these methods, a matrix of dependent variables, Y, is considered in addition to the data matrix X. The data my be metric or nonmetric Canonical correlation analysis (CCA) Canonical correlation analysis can be viewed as a logical extension of multiple regression analysis (a single metric dependent and several metric independent variables). With CCA the objective is to correlate simultaneously several metric dependent and several metric independent variables The underlying principle is to develop a linear combination of each set of variables t = p1x1+ p2x2+ + ppxp u= q y + q y + + q y maximize the correlation cov( tu, ) cor( tu, ) = std( t) std( u) with respect to the parameters p j, j = 1,, p, and q k, k = 1,, q. q q KEH Basics of Multivariate Modelling and Data Analysis 13
14 2.2 Classification of multivariate techniques Many dependent variables Partial least squares (PLS) PLS stands for Partial Least Squares, or Projection to Latent Structures It is a method to relate a matrix X to a vector y or a matrix Y. Similarly to CCA, linear combinations of the X and Y data are formed: t = x1p1 + x2p2 + + xppp, = 1,, a u = y q + y q + + y q, m= 1,, b m 1 1m 2 2m q qm In PLS, p jl, j = 1,, p, and q km, k = 1,, q, are (usually) determined so that the covariances between t l and u m, l, m, are maximized. This combines high variance of t l and u m (i.e., high information content) high correlation between t l and u m (good for predictive modelling) In addition, linear relationships between t l and u m are determined by ordinary least squares (OLS). There is a similar method, principal component regression (PCR), where p jl, j = 1,, p, are determined by principal component analysis (PCA) of X. KEH Basics of Multivariate Modelling and Data Analysis 14
15 2.2 Classification of multivariate techniques Many dependent variables Independent component analysis (ICA) In general, ICA tries to reveal the independent factors (variables/signals) from a set of mixed random variables or measurement signals. ICA is typically restricted to linear mixtures, and the underlying sources are assumed to be mutually independent. This is a stronger assumption than no correlation (which only concerns the first two moments of probability distributions)! It is assumed that the observed random variables x 1,, x p, denoted by the observation matrix X, are the result of a linear combination of m underlying sources s 1,, s m, denoted by the source matrix S. The following model, called a noise-free ICA model, can then be assumed: X = AS T Here A is a full rank n m matrix, where n is the number of observations. Both A and S are unknown, but if the sources are statistically independent, it is possible to find A. The sources are then found according to S T = A 1 X KEH Basics of Multivariate Modelling and Data Analysis 15
PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationSTATISTICS 407 METHODS OF MULTIVARIATE ANALYSIS TOPICS
STATISTICS 407 METHODS OF MULTIVARIATE ANALYSIS TOPICS Principal Component Analysis (PCA): Reduce the, summarize the sources of variation in the data, transform the data into a new data set where the variables
More informationDimension Reduction (PCA, ICA, CCA, FLD,
Dimension Reduction (PCA, ICA, CCA, FLD, Topic Models) Yi Zhang 10-701, Machine Learning, Spring 2011 April 6 th, 2011 Parts of the PCA slides are from previous 10-701 lectures 1 Outline Dimension reduction
More informationBasics of Multivariate Modelling and Data Analysis
Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 6. Principal component analysis (PCA) 6.1 Overview 6.2 Essentials of PCA 6.3 Numerical calculation of PCs 6.4 Effects of data preprocessing
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationMachine Learning 11. week
Machine Learning 11. week Feature Extraction-Selection Dimension reduction PCA LDA 1 Feature Extraction Any problem can be solved by machine learning methods in case of that the system must be appropriately
More information7. Variable extraction and dimensionality reduction
7. Variable extraction and dimensionality reduction The goal of the variable selection in the preceding chapter was to find least useful variables so that it would be possible to reduce the dimensionality
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationMachine learning for pervasive systems Classification in high-dimensional spaces
Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version
More informationECE 5984: Introduction to Machine Learning
ECE 5984: Introduction to Machine Learning Topics: (Finish) Expectation Maximization Principal Component Analysis (PCA) Readings: Barber 15.1-15.4 Dhruv Batra Virginia Tech Administrativia Poster Presentation:
More informationRobustness of Principal Components
PCA for Clustering An objective of principal components analysis is to identify linear combinations of the original variables that are useful in accounting for the variation in those original variables.
More informationABTEKNILLINEN KORKEAKOULU
Two-way analysis of high-dimensional collinear data 1 Tommi Suvitaival 1 Janne Nikkilä 1,2 Matej Orešič 3 Samuel Kaski 1 1 Department of Information and Computer Science, Helsinki University of Technology,
More informationMachine Learning! in just a few minutes. Jan Peters Gerhard Neumann
Machine Learning! in just a few minutes Jan Peters Gerhard Neumann 1 Purpose of this Lecture Foundations of machine learning tools for robotics We focus on regression methods and general principles Often
More informationLecture 7: Con3nuous Latent Variable Models
CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 7: Con3nuous Latent Variable Models All lecture slides will be available as.pdf on the course website: http://www.cs.toronto.edu/~urtasun/courses/csc2515/
More informationCPSC 340: Machine Learning and Data Mining. More PCA Fall 2017
CPSC 340: Machine Learning and Data Mining More PCA Fall 2017 Admin Assignment 4: Due Friday of next week. No class Monday due to holiday. There will be tutorials next week on MAP/PCA (except Monday).
More informationArtificial Intelligence Module 2. Feature Selection. Andrea Torsello
Artificial Intelligence Module 2 Feature Selection Andrea Torsello We have seen that high dimensional data is hard to classify (curse of dimensionality) Often however, the data does not fill all the space
More information6.867 Machine Learning
6.867 Machine Learning Problem Set 2 Due date: Wednesday October 6 Please address all questions and comments about this problem set to 6867-staff@csail.mit.edu. You will need to use MATLAB for some of
More informationIntroduction to Signal Detection and Classification. Phani Chavali
Introduction to Signal Detection and Classification Phani Chavali Outline Detection Problem Performance Measures Receiver Operating Characteristics (ROC) F-Test - Test Linear Discriminant Analysis (LDA)
More informationPrincipal Component Analysis
Principal Component Analysis Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison [based on slides from Nina Balcan] slide 1 Goals for the lecture you should understand
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationNeuroscience Introduction
Neuroscience Introduction The brain As humans, we can identify galaxies light years away, we can study particles smaller than an atom. But we still haven t unlocked the mystery of the three pounds of matter
More informationPrincipal Component Analysis, A Powerful Scoring Technique
Principal Component Analysis, A Powerful Scoring Technique George C. J. Fernandez, University of Nevada - Reno, Reno NV 89557 ABSTRACT Data mining is a collection of analytical techniques to uncover new
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering
More informationLeverage Sparse Information in Predictive Modeling
Leverage Sparse Information in Predictive Modeling Liang Xie Countrywide Home Loans, Countrywide Bank, FSB August 29, 2008 Abstract This paper examines an innovative method to leverage information from
More informationAn Introduction to Statistical and Probabilistic Linear Models
An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning
More informationROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015
ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti
More informationStatistics Toolbox 6. Apply statistical algorithms and probability models
Statistics Toolbox 6 Apply statistical algorithms and probability models Statistics Toolbox provides engineers, scientists, researchers, financial analysts, and statisticians with a comprehensive set of
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationMachine Learning 2nd Edition
INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010
More informationMultivariate Statistics Summary and Comparison of Techniques. Multivariate Techniques
Multivariate Statistics Summary and Comparison of Techniques P The key to multivariate statistics is understanding conceptually the relationship among techniques with regards to: < The kinds of problems
More informationUncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization
Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization Haiping Lu 1 K. N. Plataniotis 1 A. N. Venetsanopoulos 1,2 1 Department of Electrical & Computer Engineering,
More informationLearning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text
Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Jeff Schneider The Robotics Institute
More informationMachine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.
Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationVector Space Models. wine_spectral.r
Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationMultivariate Statistics Fundamentals Part 1: Rotation-based Techniques
Multivariate Statistics Fundamentals Part 1: Rotation-based Techniques A reminded from a univariate statistics courses Population Class of things (What you want to learn about) Sample group representing
More information9/26/17. Ridge regression. What our model needs to do. Ridge Regression: L2 penalty. Ridge coefficients. Ridge coefficients
What our model needs to do regression Usually, we are not just trying to explain observed data We want to uncover meaningful trends And predict future observations Our questions then are Is β" a good estimate
More informationLinear Dimensionality Reduction
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Principal Component Analysis 3 Factor Analysis
More informationLogistic Regression: Regression with a Binary Dependent Variable
Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction
More informationPCA, Kernel PCA, ICA
PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per
More informationRevision: Chapter 1-6. Applied Multivariate Statistics Spring 2012
Revision: Chapter 1-6 Applied Multivariate Statistics Spring 2012 Overview Cov, Cor, Mahalanobis, MV normal distribution Visualization: Stars plot, mosaic plot with shading Outlier: chisq.plot Missing
More informationNotes on Latent Semantic Analysis
Notes on Latent Semantic Analysis Costas Boulis 1 Introduction One of the most fundamental problems of information retrieval (IR) is to find all documents (and nothing but those) that are semantically
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA
More informationMachine Learning Overview
Machine Learning Overview Sargur N. Srihari University at Buffalo, State University of New York USA 1 Outline 1. What is Machine Learning (ML)? 2. Types of Information Processing Problems Solved 1. Regression
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationCS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)
CS4495/6495 Introduction to Computer Vision 8B-L2 Principle Component Analysis (and its use in Computer Vision) Wavelength 2 Wavelength 2 Principal Components Principal components are all about the directions
More informationAn introduction to clustering techniques
- ABSTRACT Cluster analysis has been used in a wide variety of fields, such as marketing, social science, biology, pattern recognition etc. It is used to identify homogenous groups of cases to better understand
More informationCOS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION
COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:
More informationPRINCIPAL COMPONENTS ANALYSIS
121 CHAPTER 11 PRINCIPAL COMPONENTS ANALYSIS We now have the tools necessary to discuss one of the most important concepts in mathematical statistics: Principal Components Analysis (PCA). PCA involves
More informationLinear & Non-Linear Discriminant Analysis! Hugh R. Wilson
Linear & Non-Linear Discriminant Analysis! Hugh R. Wilson PCA Review! Supervised learning! Fisher linear discriminant analysis! Nonlinear discriminant analysis! Research example! Multiple Classes! Unsupervised
More informationCourse in Data Science
Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an
More informationDrift Reduction For Metal-Oxide Sensor Arrays Using Canonical Correlation Regression And Partial Least Squares
Drift Reduction For Metal-Oxide Sensor Arrays Using Canonical Correlation Regression And Partial Least Squares R Gutierrez-Osuna Computer Science Department, Wright State University, Dayton, OH 45435,
More informationBrief Introduction of Machine Learning Techniques for Content Analysis
1 Brief Introduction of Machine Learning Techniques for Content Analysis Wei-Ta Chu 2008/11/20 Outline 2 Overview Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) Support Vector Machine (SVM) Overview
More informationUnsupervised Learning Methods
Structural Health Monitoring Using Statistical Pattern Recognition Unsupervised Learning Methods Keith Worden and Graeme Manson Presented by Keith Worden The Structural Health Monitoring Process 1. Operational
More informationInternational Journal of Pure and Applied Mathematics Volume 19 No , A NOTE ON BETWEEN-GROUP PCA
International Journal of Pure and Applied Mathematics Volume 19 No. 3 2005, 359-366 A NOTE ON BETWEEN-GROUP PCA Anne-Laure Boulesteix Department of Statistics University of Munich Akademiestrasse 1, Munich,
More informationClassification 2: Linear discriminant analysis (continued); logistic regression
Classification 2: Linear discriminant analysis (continued); logistic regression Ryan Tibshirani Data Mining: 36-462/36-662 April 4 2013 Optional reading: ISL 4.4, ESL 4.3; ISL 4.3, ESL 4.4 1 Reminder:
More informationMachine Learning for Data Science (CS4786) Lecture 12
Machine Learning for Data Science (CS4786) Lecture 12 Gaussian Mixture Models Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016fa/ Back to K-means Single link is sensitive to outliners We
More informationCorrelation Preserving Unsupervised Discretization. Outline
Correlation Preserving Unsupervised Discretization Jee Vang Outline Paper References What is discretization? Motivation Principal Component Analysis (PCA) Association Mining Correlation Preserving Discretization
More informationHigh-dimensional regression modeling
High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making
More informationUnconstrained Ordination
Unconstrained Ordination Sites Species A Species B Species C Species D Species E 1 0 (1) 5 (1) 1 (1) 10 (4) 10 (4) 2 2 (3) 8 (3) 4 (3) 12 (6) 20 (6) 3 8 (6) 20 (6) 10 (6) 1 (2) 3 (2) 4 4 (5) 11 (5) 8 (5)
More informationCS 340 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis
CS 3 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis AD March 11 AD ( March 11 1 / 17 Multivariate Gaussian Consider data { x i } N i=1 where xi R D and we assume they are
More informationPredictive analysis on Multivariate, Time Series datasets using Shapelets
1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,
More informationOverview of Statistical Tools. Statistical Inference. Bayesian Framework. Modeling. Very simple case. Things are usually more complicated
Fall 3 Computer Vision Overview of Statistical Tools Statistical Inference Haibin Ling Observation inference Decision Prior knowledge http://www.dabi.temple.edu/~hbling/teaching/3f_5543/index.html Bayesian
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationRandomized Algorithms
Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models
More informationMachine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012
Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Principal Components Analysis Le Song Lecture 22, Nov 13, 2012 Based on slides from Eric Xing, CMU Reading: Chap 12.1, CB book 1 2 Factor or Component
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin
More informationClassification: Linear Discriminant Analysis
Classification: Linear Discriminant Analysis Discriminant analysis uses sample information about individuals that are known to belong to one of several populations for the purposes of classification. Based
More informationMultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A
MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018 Pietro Guccione, PhD DEI - DIPARTIMENTO DI INGEGNERIA ELETTRICA E DELL INFORMAZIONE POLITECNICO DI
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks Delivered by Mark Ebden With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable
More informationCourse content (will be adapted to the background knowledge of the class):
Biomedical Signal Processing and Signal Modeling Lucas C Parra, parra@ccny.cuny.edu Departamento the Fisica, UBA Synopsis This course introduces two fundamental concepts of signal processing: linear systems
More informationUnsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent
Unsupervised Machine Learning and Data Mining DS 5230 / DS 4420 - Fall 2018 Lecture 7 Jan-Willem van de Meent DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Dimensionality Reduction Goal:
More informationExperimental Design and Data Analysis for Biologists
Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1
More informationMachine Learning (Spring 2012) Principal Component Analysis
1-71 Machine Learning (Spring 1) Principal Component Analysis Yang Xu This note is partly based on Chapter 1.1 in Chris Bishop s book on PRML and the lecture slides on PCA written by Carlos Guestrin in
More informationSUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION
SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology
More informationIndependent Component Analysis
1 Independent Component Analysis Background paper: http://www-stat.stanford.edu/ hastie/papers/ica.pdf 2 ICA Problem X = AS where X is a random p-vector representing multivariate input measurements. S
More informationTensor Methods for Feature Learning
Tensor Methods for Feature Learning Anima Anandkumar U.C. Irvine Feature Learning For Efficient Classification Find good transformations of input for improved classification Figures used attributed to
More informationComparative Analysis of ICA Based Features
International Journal of Emerging Engineering Research and Technology Volume 2, Issue 7, October 2014, PP 267-273 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Comparative Analysis of ICA Based Features
More informationPrincipal component analysis
Principal component analysis Motivation i for PCA came from major-axis regression. Strong assumption: single homogeneous sample. Free of assumptions when used for exploration. Classical tests of significance
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationMachine Learning Linear Regression. Prof. Matteo Matteucci
Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares
More informationEEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1
EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle
More informationSTATS 306B: Unsupervised Learning Spring Lecture 2 April 2
STATS 306B: Unsupervised Learning Spring 2014 Lecture 2 April 2 Lecturer: Lester Mackey Scribe: Junyang Qian, Minzhe Wang 2.1 Recap In the last lecture, we formulated our working definition of unsupervised
More informationIntroduction to Machine Learning
Introduction to Machine Learning CS4731 Dr. Mihail Fall 2017 Slide content based on books by Bishop and Barber. https://www.microsoft.com/en-us/research/people/cmbishop/ http://web4.cs.ucl.ac.uk/staff/d.barber/pmwiki/pmwiki.php?n=brml.homepage
More informationCPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018
CPSC 340: Machine Learning and Data Mining Sparse Matrix Factorization Fall 2018 Last Time: PCA with Orthogonal/Sequential Basis When k = 1, PCA has a scaling problem. When k > 1, have scaling, rotation,
More informationCorrelation and regression
NST 1B Experimental Psychology Statistics practical 1 Correlation and regression Rudolf Cardinal & Mike Aitken 11 / 12 November 2003 Department of Experimental Psychology University of Cambridge Handouts:
More informationDeep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang
Deep Learning Basics Lecture 7: Factor Analysis Princeton University COS 495 Instructor: Yingyu Liang Supervised v.s. Unsupervised Math formulation for supervised learning Given training data x i, y i
More informationStructure in Data. A major objective in data analysis is to identify interesting features or structure in the data.
Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two
More information26:010:557 / 26:620:557 Social Science Research Methods
26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview
More informationPattern recognition. "To understand is to perceive patterns" Sir Isaiah Berlin, Russian philosopher
Pattern recognition "To understand is to perceive patterns" Sir Isaiah Berlin, Russian philosopher The more relevant patterns at your disposal, the better your decisions will be. This is hopeful news to
More informationWhat is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.
What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,
More informationGenerative classifiers: The Gaussian classifier. Ata Kaban School of Computer Science University of Birmingham
Generative classifiers: The Gaussian classifier Ata Kaban School of Computer Science University of Birmingham Outline We have already seen how Bayes rule can be turned into a classifier In all our examples
More informationLecture 4 Discriminant Analysis, k-nearest Neighbors
Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se
More information