A Robust PCA by LMSER Learning with Iterative Error Reinforcement

Bai-ling Zhang (blzhang@cs.cuhk.hk), Irwin King (king@cs.cuhk.hk), Lei Xu (lxu@cs.cuhk.hk)
Department of Computer Science, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong

Abstract

We propose an approach for performing adaptive principal component extraction. In this approach, the Least Mean Squared Error Reconstruction (LMSER) principle is implemented successively, with the reconstruction error fed back as input for training the network's weights. Simulation results show that this implementation of LMSER performs Robust Principal Component Analysis (PCA) and is capable of resisting strong outliers.

1 Introduction

Linear neurons learning under an unsupervised Hebbian rule can perform a linear statistical analysis of the input data, as first shown by Oja [3, 2], who proposed a learning rule by which a single neuron finds the first principal component of the covariance matrix of the input statistics. Later, a number of researchers devised neural networks that find the first k > 1 principal components of this matrix, or a subspace spanned by the first k > 1 principal components; these are called k-PCA or Principal Subspace Analysis (PSA) networks. Detailed references can be found, e.g., in [6].

In 1991, Xu [5, 6] proposed the Least Mean Squared Error Reconstruction (LMSER) principle for neural network self-organization. In particular, the special case of one-layer linear networks was investigated in detail. It was shown in [5, 6] that for one-layer networks the LMSER rule performs a PSA similar to that of [2]: it descends the mean squared reconstruction error in a direction with a positive projection on the evolution direction of the LMSER rule. Furthermore, the LMSER rule and Oja's subspace rule can also perform true k-PCA when different scaling factors are introduced at the output of each neuron. Moreover, it was discovered in [5, 6] that for one-layer networks with a nonlinear sigmoid activation, the LMSER rule breaks the symmetry of PSA, letting the weight vector of each neuron approximate one of the true k > 1 principal components. Recently, Karhunen and Joutsensalo [1] applied this nonlinear LMSER to the problem of signal separation and showed that it gives better separation than other nonlinear PCA approaches.

This paper considers implementing LMSER or PCA on input data containing strong outliers. This problem, called Robust PCA, was studied in the statistics literature by [4] using block approaches. Based on a statistical physics approach, Xu and Yuille [9, 8, 10] developed a set of adaptive Robust PCA learning rules that can resist strong outliers in the data. In this paper, we propose a new approach to Robust PCA based on a successive implementation of the one-layer nonlinear LMSER rule. (This work was supported in part by the Hong Kong Research Grant Council. For correspondence, please contact I. King, king@cs.cuhk.hk.)

2 New Approach for Robust PCA

For a symmetrically circuited single-layer feedforward network, let the L x M weight matrix W_t = [w_t(1), ..., w_t(M)] hold the weight vectors of the M neurons after t iterations as its columns. Then y_i = x_t^T w_t(i) is the linear output of the i-th neuron, and z_i = \sigma(y_i) is the corresponding nonlinear output through a nonlinear function \sigma(\cdot). Let u = W z denote the reconstruction vector obtained by linearly transforming the output vector z through the weight matrix W, with u = [u_1, ..., u_L]^T and z = [z_1, ..., z_M]^T. The one-layer nonlinear LMSER rule was proposed in [5, 6] with the criterion

    minimize  J(W) = E\{ \| x - W \sigma(W^T x) \|^2 \},    (1)

which is minimized by gradient descent,

    W_{t+1} = W_t - \eta_t \nabla J(W_t) |_{x = x_t},    (2)

    \nabla J(W_t) |_{x = x_t} = -( x_t e_t^T W_t \mathrm{diag}(z_t') + e_t z_t^T ),    (3)

where z_t = \sigma(W_t^T x_t), z_t' = \sigma'(W_t^T x_t), and e_t = x_t - u_t = x_t - W_t \sigma(W_t^T x_t) is the reconstruction error vector. Equation (3) is the same as the rule first given in [5], in slightly different notation, and was later used by Karhunen and Joutsensalo [1] as their Eq. (6) for the problem of signal separation.
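As a concrete illustration, here is a minimal NumPy sketch of one stochastic step of Eqs. (2)-(3), assuming a logistic sigmoid for \sigma; the function and variable names are ours, not from the paper.

    import numpy as np

    def sigmoid(y):
        """Logistic sigmoid and its derivative."""
        s = 1.0 / (1.0 + np.exp(-y))
        return s, s * (1.0 - s)

    def lmser_step(W, x, eta):
        """One stochastic-gradient step of Eqs. (2)-(3) on a sample x.

        W : (L, M) matrix whose columns are the M neuron weight vectors.
        x : (L,) input sample.
        """
        y = W.T @ x            # linear outputs y_i = x^T w(i)
        z, dz = sigmoid(y)     # z_t = sigma(y), z_t' = sigma'(y)
        e = x - W @ z          # reconstruction error e_t = x - W sigma(W^T x)
        # Eq. (3): grad J = -( x_t e_t^T W_t diag(z_t') + e_t z_t^T )
        grad = -(np.outer(x, (e @ W) * dz) + np.outer(e, z))
        return W - eta * grad  # Eq. (2)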

Here we propose an approach to Robust PCA based on successively applying the LMSER learning rule. For each data sample, we compute a new data sequence from the reconstruction error and use this sequence to train the network. Initially, the input vector \tilde{x}_1 is set equal to the original vector x, and the weight vectors w(k), k = 1, ..., M, are all updated via LMSER. From the reconstructed vector u_1 = W \sigma(W^T \tilde{x}_1), a new input \tilde{x}_2 = \tilde{x}_1 - u_1 is obtained. In this way, for each data sample x we form an augmented input sequence

    \tilde{x}_1 = x, \quad \tilde{x}_2 = \tilde{x}_1 - u_1, \quad \tilde{x}_3 = \tilde{x}_2 - u_2, \quad \ldots, \quad \tilde{x}_M = \tilde{x}_{M-1} - u_{M-1},

with u_i being the reconstructed data. In other words, the procedure \tilde{x}_i = \tilde{x}_{i-1} - u_{i-1} for computing a new input is repeated M times. These are the iterative error reinforcement steps.

Principal component analysis can be formulated as a mean squared error minimization problem. In the linear network case, the criterion for the k-th principal component is

    minimize  J_t(w(k)) = E\{ \| x - w(k) w(k)^T \hat{x} \|^2 \},    (4)

where \hat{x} = (I - W(k-1)^T W(k-1)) x and W(k-1) = [c_1, ..., c_{k-1}]^T is the matrix composed of the previous k-1 eigenvectors of the data covariance matrix. From this, the principal eigenvectors can be computed adaptively, one by one, by applying the above minimization successively. Such a scheme is purely linear, so nothing beyond PCA can be expected from it; moreover, parallel updating of the weights is not possible. To keep parallel updating in a linear PCA network, a set of different scaling factors is introduced at the output of each neuron [6, 7]. One of the algorithms proposed in [6] is

    W_{t+1} = W_t + \eta_t [ x_t y_t^T D - u_t y_t^T ],    (5)

where D = \mathrm{diag}[\alpha_1, \ldots, \alpha_M] with \alpha_1 > \alpha_2 > \cdots > \alpha_M > 0. This rule performs true PCA.
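For reference, here is a minimal sketch of the scaled linear rule (5); the decreasing scaling factors break the subspace symmetry so that the true principal components, rather than an arbitrary basis of the principal subspace, are found. The learning rate and the concrete alphas in the usage line are illustrative choices of ours.

    import numpy as np

    def scaled_pca_step(W, x, eta, alphas):
        """One step of Eq. (5): W <- W + eta * (x y^T D - u y^T).

        alphas : decreasing positive scaling factors alpha_1 > ... > alpha_M > 0.
        """
        y = W.T @ x                  # linear outputs y = W^T x
        u = W @ y                    # linear reconstruction u = W y
        D = np.diag(alphas)
        return W + eta * (np.outer(x, y) @ D - np.outer(u, y))

    # usage: W = scaled_pca_step(W, x, 0.01, np.array([4.0, 3.0, 2.0, 1.0]))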

We combine this idea of scaling factors with the above successive implementation of the one-layer nonlinear LMSER. Let the squared reconstruction error through the weights of the k-th output be

    J_t = \alpha_t \| \tilde{x}_t - u_t \|^2 = \alpha_t \sum_{i=1}^{L} ( \tilde{x}_i - w_i(k) \sigma(w(k)^T \tilde{x}) )^2,    (6)

where \alpha_t is a scaling factor with the same function as those in Eq. (5). From our experience, we take

    \alpha_i = \exp(-2 |i - t|), \quad i, t = 1, \ldots, M, \ i \neq t.    (7)

Taking the gradient of Eq. (6), we have

    \nabla J_t = -2 \alpha_t e_t \sigma(w(k)^T \tilde{x}) - 2 \alpha_t (e_t^T w(k)) \sigma'(w(k)^T \tilde{x}) \tilde{x},    (8)

where e_t = \tilde{x} - w(k) \sigma(w(k)^T \tilde{x}) is the reconstruction error vector. The gradient descent algorithm in matrix form for the overall system is

    W_{t+1} = W_t + \eta_t [ e_t z_t^T + x_t e_t^T W_t \mathrm{diag}(z_t') ] D,    (9)

where D = \mathrm{diag}(\alpha_1, \ldots, \alpha_M). For a given sample x, the signal is propagated bidirectionally M times, and at each time t the network input is updated by the reconstruction vector as \tilde{x}_t = \tilde{x}_{t-1} - u_{t-1}.
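Putting the pieces together, the sketch below shows our reading of the full per-sample procedure: M bidirectional propagations, each applying the scaled update (9) and then replacing the input by the previous reconstruction error. Placing D to the right of the bracketed gradient (so the dimensions agree with Eq. (3)) is an assumption on our part, as is the per-pass update schedule.

    import numpy as np

    def sigmoid(y):
        s = 1.0 / (1.0 + np.exp(-y))
        return s, s * (1.0 - s)

    def error_reinforcement_step(W, x, eta, alphas):
        """Train W on one sample x with M iterative error reinforcement passes."""
        M = W.shape[1]
        D = np.diag(alphas)          # D = diag(alpha_1, ..., alpha_M)
        x_t = x.copy()               # ~x_1 = x
        for _ in range(M):
            y = W.T @ x_t
            z, dz = sigmoid(y)
            u = W @ z                # reconstruction u_t = W sigma(W^T ~x_t)
            e = x_t - u              # e_t = ~x_t - u_t
            # Eq. (9): W <- W + eta * [ e z^T + x e^T W diag(z') ] D
            grad = np.outer(e, z) + np.outer(x_t, (e @ W) * dz)
            W = W + eta * (grad @ D)
            x_t = x_t - u            # ~x_{t+1} = ~x_t - u_t (error reinforcement)
        return W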

3 Simulations

We generated a set of 4-dimensional random Gaussian data points distributed in an ellipsoidal region centered at the origin of R^4, as shown in Fig. 1. Among these sample points, a number were selected as outliers and replaced by strongly amplified versions of the original values, as illustrated in Fig. 1. Two experiments were performed with the proposed iterative error reinforcement algorithm.

First, we compared the learning results against the PCA learning scheme of Eq. (5) in a semi-linear network with amplifier factors A = diag(4, 3, 2, 1). When there were no outliers in the data samples, the proposed heuristic converged to a solution similar to that of the semi-linear network, as shown in Fig. 2. This illustrates that, for data samples without outliers, the proposed algorithm is functionally equivalent to the semi-linear PCA network of Eq. (5).

Second, we compared the iterative error reinforcement algorithm with the variant without iterative error reinforcement, which directly sets e_t = x - w(k) \sigma(w(k)^T x) in Eq. (9). The result is shown in Fig. 2. With outliers in the data samples, both algorithms took longer to converge. Although both methods converged to an adequate solution, demonstrating their robustness against outliers, the iterative error reinforcement method converged more quickly than the previous nonlinear method, as shown in Fig. 2.
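A sketch of data generation in the spirit of this setup follows. The sample count, outlier count, and amplification factor are illegible in the source scan, so the numbers below are placeholders, as are the ellipsoid axes.

    import numpy as np

    rng = np.random.default_rng(0)
    n, dim = 500, 4                           # placeholder sample count
    axes = np.array([4.0, 3.0, 2.0, 1.0])     # placeholder ellipsoid axes
    X = rng.standard_normal((n, dim)) * axes  # Gaussian cloud in an ellipsoidal region

    n_out = 10                                # placeholder outlier count
    idx = rng.choice(n, n_out, replace=False)
    X[idx] *= 10.0                            # outliers: amplified originals (placeholder factor)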

4 Conclusions

We have proposed a new approach for performing adaptive principal component extraction. In this approach, a new input sequence is formed for each data sample and LMSER learning is applied successively until convergence. This heuristic algorithm is capable of resisting outliers while obtaining the PCA basis vectors. Comparative experiments have shown that it is robust and converges more rapidly than previous methods without the iterative reinforcement.

Figure 1: 4-dimensional Gaussian data samples used in the experiment, distributed in an ellipsoidal region; projections onto the x_1-x_2, x_2-x_3, x_3-x_4, and x_4-x_1 planes are displayed. Also shown are the data samples contaminated with outliers, viewed in the x_1 x_2 x_3, x_1 x_2 x_4, x_2 x_3 x_4, and x_1 x_3 x_4 subspaces.

Figure 2: Learning curves of the semi-linear PCA algorithm of Eq. (5) (dashed) and of the proposed approach (solid) on data samples without outliers. The curves are the inner products between the four weight vectors and the corresponding eigenvectors of the data covariance matrix, computed beforehand; convergence to 1 signifies a solution. Also shown are the learning results of the proposed iterative error reinforcement algorithm on data contaminated by outliers, and of the variant without iterative error reinforcement, i.e., with e_t = x - w(k) \sigma(w(k)^T x) in Eq. (9).

References

[1] J. Karhunen and J. Joutsensalo. Representation and separation of signals using nonlinear PCA type learning. Neural Networks, 7(1):113-127, 1994.
[2] E. Oja. Neural networks, principal components, and subspaces. International Journal of Neural Systems, 1:61-68, 1989.
[3] E. Oja. A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15:267-273, 1982.
[4] F. H. Ruymgaart. A robust principal component analysis. Journal of Multivariate Analysis, pages 485-497, 1981.
[5] L. Xu. Least MSE reconstruction for self-organization: (I) multi-layer neural nets. In Proc. International Joint Conference on Neural Networks, volume II, pages 2362-2367, Singapore, November 1991.
[6] L. Xu. Least mean square error reconstruction principle for self-organizing neural-nets. Neural Networks, 6:627-648, 1993.
[7] L. Xu. Beyond PCA learnings: From linear to nonlinear and from global representation to local representation. In M.-W. Kim and S.-Y. Lee, editors, Proceedings of the International Conference on Neural Information Processing, Vol. II, pages 943-949, Seoul, Korea, October 1994.
[8] L. Xu and A. L. Yuille. Self-organizing rules for robust principal component analysis. In S. J. Hanson, J. D. Cowan, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 467-474, Morgan Kaufmann, San Mateo, 1993.
[9] L. Xu and A. L. Yuille. Robust PCA learning rules based on statistical physics approach. In Proceedings of the International Joint Conference on Neural Networks, Baltimore, volume I, 1992.
[10] L. Xu and A. L. Yuille. Robust principal component analysis by self-organizing rules based on statistical physics approach. IEEE Transactions on Neural Networks, 6(1):131-143, 1995.
