Principal Components Analysis: A Method of Self-Organized Learning


Principal Components Analysis
Standard technique for data reduction in statistical pattern matching and signal processing.
Unsupervised learning: learning from examples without a teacher.
Discovers patterns in the input data.
Relies on local learning rules.
Utilizes Hebbian learning.
Heavily inspired by biology.

Basic Intuition
Global order can arise from local interactions (Alan Turing).
The effectiveness of a variable synapse between two neurons is increased by the repeated activation of one neuron by the other across the synapse (Hebb).
Combining the two hypotheses gives a global learning strategy.

Self-Organized Learning Principles
Modifications in synaptic weights tend to self-amplify.
Limitation of resources leads to competition among synapses, and therefore to the selection of the most vigorously growing synapses at the expense of the others.

Self-Organized Learning Principles
Modifications in synaptic weights tend to cooperate.
Order and structure in the activation patterns represent redundant information that is acquired by the neural network in the form of knowledge, which is a necessary prerequisite to self-organized learning.

Self-Organized Learning for Visual Applications
Multiple-layer networks, where each layer is responsible for a certain feature.
Simple features (e.g. contrast, edges) are handled by early layers, and complex features (e.g. surface texture) are handled by later layers.
Learning proceeds on a layer-by-layer basis.

Typical Model
Each neuron acts as a linear combiner.
Connections are fixed throughout the learning process, i.e. connections may be strengthened or weakened but not added or destroyed.

Feature Extraction
The process of transforming a data space into a feature space of the same dimension.
To aid data analysis, the transformation can instead map an input space of a given dimension to a feature space of a smaller dimension.
The transformation should be invertible and should minimize the mean-square error of the resulting space.

Principal Components Analysis
For an input vector x:
Normalize x to have a zero mean.
Define the correlation matrix R = E[x x^T].
The normalized (Euclidean norm = 1) eigenvectors of the matrix R define the directions of maximum variance.
The corresponding eigenvalues define the corresponding variance values.
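This recipe can be sketched numerically. A minimal NumPy example (the 2-D toy data, the noise level, and the random seed are illustrative assumptions, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: the direction of maximum variance is near (1, 1)/sqrt(2)
t = rng.normal(size=500)
X = np.column_stack([t + 0.1 * rng.normal(size=500),
                     t + 0.1 * rng.normal(size=500)])

X = X - X.mean(axis=0)            # normalize to zero mean
R = (X.T @ X) / len(X)            # sample estimate of R = E[x x^T]
eigvals, eigvecs = np.linalg.eigh(R)

# eigh returns eigenvalues in ascending order; reverse for dominance order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
print(eigvals)         # variances along the principal directions
print(eigvecs[:, 0])   # dominant principal direction (unit Euclidean norm)
```

`eigh` is appropriate here because R is symmetric; its eigenvectors come out orthonormal, matching the "Euclidean norm = 1" condition on the slide.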

Feature Mapping
The eigenvectors of the matrix R are called the principal directions; call these vectors q_j.
The projections of the input vector x onto the principal directions are called the principal components: a_j = q_j^T x = x^T q_j.
By substituting the vector a = [a_1, ..., a_m]^T for the vector x, the data is transformed to a principal-components feature space.

PCA Illustrated

Dimensionality Reduction
Features can be ordered by dominance according to the values of the variances (= eigenvalues).
To reduce the dimensionality of the feature space, include in the feature space only the dominant l components, where l < the dimensionality m of the input space.
The resulting error is the sum of the discarded components:
e = Σ_{j=l+1}^{m} a_j q_j
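The projection a_j = q_j^T x and the truncated reconstruction can be checked numerically. A small NumPy sketch (the 5-dimensional toy data and the choice l = 2 are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
m, l = 5, 2
# Data lying near a 2-D subspace of R^5, plus small noise (assumed toy setup)
X = rng.normal(size=(2000, l)) @ rng.normal(size=(l, m))
X += 0.01 * rng.normal(size=X.shape)
X -= X.mean(axis=0)

R = (X.T @ X) / len(X)
eigvals, Q = np.linalg.eigh(R)
Q = Q[:, np.argsort(eigvals)[::-1]]    # columns q_j, dominant first

x = X[0]
a = Q.T @ x                            # principal components a_j = q_j^T x
x_hat = Q[:, :l] @ a[:l]               # keep only the l dominant components
e = x - x_hat                          # the truncation error of this sample
print(np.linalg.norm(e))               # small, since X is nearly 2-dimensional
```

Because Q is orthonormal, x = Q a exactly, so the error e equals the sum of the discarded components Σ_{j>l} a_j q_j, as the slide states.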

Example
Original representation of characters in the form of 32×32 pixels, i.e. a vector of size 1024.
Images represent increasing dimensionality up to 64, a reduction by a factor of 16 at the right-most column.

Hebbian-Based Maximum Eigenfilter
Assume a single linear neuron with inputs x_1, ..., x_m and weights w_1, ..., w_m, whose output is:
y = Σ_{i=1}^{m} w_i x_i

Hebbian-Based Maximum Eigenfilter
Define the learning rule (a Hebbian update followed by renormalization):
w_i(n+1) = [w_i(n) + η y(n) x_i(n)] / sqrt(Σ_{i=1}^{m} [w_i(n) + η y(n) x_i(n)]^2)
For a small learning rate η, this approximates to Oja's rule:
w_i(n+1) = w_i(n) + η y(n) [x_i(n) − y(n) w_i(n)]
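A minimal simulation of this single-neuron filter under Oja's rule; the 2-D input stream, seed, and learning rate are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# 2-D inputs whose correlation matrix has its dominant eigenvector near (1, 1)/sqrt(2)
t = rng.normal(size=5000)
X = np.column_stack([t, t]) + 0.2 * rng.normal(size=(5000, 2))
X -= X.mean(axis=0)

w = rng.normal(size=2)
eta = 0.01
for x in X:
    y = w @ x                          # linear neuron output
    w += eta * y * (x - y * w)         # Oja's rule: Hebbian term minus decay

R = (X.T @ X) / len(X)
top = np.linalg.eigh(R)[1][:, -1]      # true dominant eigenvector of R
print(w)                               # close to ±top, with unit norm
print(np.var(X @ w))                   # output variance, near the largest eigenvalue
```

The weight vector self-normalizes: the decay term −η y² w keeps ‖w‖ near 1 without any explicit renormalization step.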

Hebbian-Based Maximum Eigenfilter
We can prove that:
The variance of the output converges to the largest eigenvalue of the correlation matrix R.
The synaptic weight vector w converges to the corresponding normalized eigenvector.

Hebbian-Based Principal Components Analysis
Can the single-neuron maximum eigenfilter be extended to the other principal components? How?

Hebbian-Based Principal Components Analysis
Assume a neural network formed of a single layer of linear neurons, with the number of neurons ≤ the number of inputs.

Hebbian-Based Principal Components Analysis
The outputs are defined by:
y_j = Σ_{i=1}^{m} w_ji x_i
Define the learning rule (Sanger's generalized Hebbian algorithm):
Δw_ji = η y_j (x_i − Σ_{k=1}^{j} w_ki y_k)

Hebbian-Based Principal Components Analysis
The negative component has the effect of subtracting the principal components represented by the previous outputs.
We can prove that:
The variances of the outputs converge to the eigenvalues of the correlation matrix R, in descending order.
The synaptic weight vectors w_j converge to the corresponding normalized eigenvectors.

Example 1
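A sketch of this single-layer rule on toy 3-D data; the axis-aligned variances (9, 4, 1) are an assumed setup so that the true eigenvectors of R are simply the coordinate axes:

```python
import numpy as np

rng = np.random.default_rng(3)
# Independent axes with std 3, 2, 1, so the eigenvectors of R are e1, e2, e3
X = rng.normal(size=(20000, 3)) * np.array([3.0, 2.0, 1.0])
X -= X.mean(axis=0)

n_out, eta = 2, 0.001
W = 0.1 * rng.normal(size=(n_out, 3))   # row j holds the weight vector of output j
for x in X:
    y = W @ x
    for j in range(n_out):
        # Delta w_j = eta * y_j * (x - sum_{k<=j} w_k y_k):
        # Oja's rule plus deflation of the components captured by earlier outputs
        W[j] += eta * y[j] * (x - W[:j + 1].T @ y[:j + 1])
print(W)   # rows approach ±e1 and ±e2, the two dominant eigenvectors
```

The deflation sum runs up to k = j, so each neuron sees the input with all earlier principal components subtracted, which is what forces the rows of W toward successive eigenvectors rather than all collapsing onto the first.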

Example 2

Adaptive Principal Components Analysis
Train the network one neuron at a time.
We use feedback from each neuron to all the neurons that follow it in order.

Adaptive Principal Components Analysis
For training of neuron j, we use a network:
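A rough simulation of this one-neuron-at-a-time scheme in the APEX style: each new neuron has feedforward weights w trained by an Oja-type update and anti-Hebbian feedback weights a from the already-trained neurons. The toy data, learning rate, and sample count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
# Independent axes with std 3, 2, 1, so the principal directions are e1, e2, e3
X = rng.normal(size=(50000, 3)) * np.array([3.0, 2.0, 1.0])
X -= X.mean(axis=0)

eta, trained = 0.001, []
for j in range(2):                     # extract two components, one neuron at a time
    w = 0.1 * rng.normal(size=3)       # feedforward weights of neuron j
    a = np.zeros(j)                    # feedback weights from the j earlier neurons
    for x in X:
        y_prev = np.array([wk @ x for wk in trained])   # earlier neurons are frozen
        y = w @ x + a @ y_prev
        w += eta * y * (x - y * w)                      # Oja-type feedforward update
        a -= eta * y * (y_prev + y * a)                 # anti-Hebbian feedback update
    trained.append(w)
print(trained)   # close to ±e1, then ±e2, with the feedback weights decaying toward 0
```

The anti-Hebbian feedback decorrelates each new output from the previous ones, so neuron j settles on the j-th principal direction while its feedback weights shrink toward zero.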

Adaptive Principal Components Analysis
The output of neuron j is defined as:
y_j = w_j^T x + a_j^T y_{j−1}
where y_{j−1} = [y_1, ..., y_{j−1}]^T collects the outputs of the previously trained neurons, w_j is the feedforward weight vector, and a_j is the feedback weight vector.

And the learning rules:
w_j(n+1) = w_j(n) + η y_j(n) [x(n) − y_j(n) w_j(n)]
a_j(n+1) = a_j(n) − η y_j(n) [y_{j−1}(n) + y_j(n) a_j(n)]

Kernel PCA
Introduces a non-linear hidden layer to convert non-linear input spaces to an equivalent linear set of principal components.

Kernel PCA Example
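A compact sketch of the kernel PCA idea: build a kernel matrix, center it (which implicitly subtracts the feature-space mean), and project onto its top eigenvector. The two-ring data and the RBF kernel width gamma are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
# Two concentric rings: structure that no single linear direction can capture
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, n)
r = np.where(np.arange(n) < n // 2, 1.0, 3.0)
X = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

gamma = 0.5                                   # assumed RBF kernel width
sq = (X ** 2).sum(axis=1)
K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

# Center K: equivalent to zero-mean data in the implicit feature space
one = np.full((n, n), 1.0 / n)
Kc = K - one @ K - K @ one + one @ K @ one

eigvals, eigvecs = np.linalg.eigh(Kc)         # ascending eigenvalue order
lam, v = eigvals[-1], eigvecs[:, -1]
alpha = v / np.sqrt(lam)                      # scale so the feature-space PC has unit norm
z = Kc @ alpha                                # first kernel principal component per point
```

All computation happens through the kernel matrix; the non-linear feature space is never constructed explicitly, which is the point of the kernel trick. For a suitable gamma, the leading components reflect the ring structure that linear PCA cannot.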

Image Coding Strategies
Compact coding: represent the image as a reduced set of vectors designed to minimize errors, e.g. PCA.
Sparse distributed coding: transform the redundancy in the image representation to match the redundancy of recognition in visual systems.