Using LSTM Network in Face Classification Problems

Débora C. Corrêa, Denis H. P. Salvadeo, Alexandre L. M. Levada, José H. Saito, Nelson D. A. Mascarenhas, Jander Moreira
Departamento de Computação, Universidade Federal de São Carlos, SP, Brazil
Instituto de Física, Universidade de São Paulo, São Carlos, SP, Brazil

Abstract. Many studies have used convolutional neural networks for face classification tasks. Aiming to reduce the number of training samples as well as the training time, we propose to use an LSTM network and compare its performance with that of a standard MLP network. Experiments with face images from the CBCL database, using PCA for feature extraction, provided good results, indicating that the LSTM can learn properly even with a reduced training set and that its performance is much better than that of the MLP.

1. Introduction

In recent decades, the human face has been explored in a variety of neural network, computer vision and pattern recognition applications. Many technical challenges exist in tasks involving face classification problems; some of the principal difficulties are [1]: large variability, highly complex nonlinear manifolds, high dimensionality and small sample size. Many approaches use convolutional artificial neural networks for face classification tasks, for example the Neocognitron network [2, 3]. Our motivation for using other network models is to reduce the number of training samples and to improve classification performance.

It is well known that artificial neural networks (ANNs), also known as connectionist systems, represent a non-algorithmic form of computation inspired by the structure and processing of the human brain [4]. In this non-algorithmic approach, computation is performed by a set of simple processing units, the neurons, connected in a network and acting in parallel. The neurons are connected by weights, which store the network's knowledge. To represent a desired solution to a problem, an ANN goes through a training (learning) phase, which consists of presenting a set of examples (the training dataset) to the network so that it can extract the features necessary to represent the given information [4].

In this work, our objective is first to use the LSTM (Long Short-Term Memory) network for face classification tasks and to check how well it performs in this kind of application. Secondly, we compare the results obtained by the LSTM with those of a traditional MLP (Multi-Layer Perceptron) network in order to show that LSTM networks are more capable of learning in the presence of long-term dependencies in the input data; besides, LSTM networks are faster than MLPs in the learning phase. A first study on the use of these networks for face classification is reported in [5].

Statistical techniques based on principal component analysis (eigenfaces) are effective in reducing the dimensionality of face images. In this work we chose PCA as the feature extraction tool.

The remainder of the paper is organized as follows: section 2 describes the LSTM network; section 3 describes the principal concepts of PCA, the feature extraction technique; section 4 presents the proposed methodology and the results; and section 5 presents the final remarks and conclusions.

2. LSTM Neural Network

The LSTM network is an alternative recurrent neural network architecture inspired by human memory systems. Its principal motivation is to solve the vanishing-gradient problem by enforcing constant error flow through constant error carousels (CECs) within special units, thus permitting non-decaying error flow back in time [6] [7]. The CECs are the units responsible for keeping the error signal.
This enables the network to learn important data and store it without degradation over long periods of time. Also, as we verified in the experiments, this feature improves the learning process by decreasing the training time and reducing the mean square error (training error). In an LSTM network there are memory blocks instead of hidden neurons (Figure 1(a)). A memory block is formed by one or more memory cells and a pair of adaptive, multiplicative gating units which gate the input and output of all cells in the block (Figure 1(b)), controlling what the network is supposed to learn. An illustration of a memory block with one memory cell is shown in Figure 2.

Figure 1: Left: recurrent neural network with one recurrent hidden layer; right: LSTM with memory blocks in the hidden layer (only one is shown) [6].

Figure 2: LSTM memory block with one memory cell [6].
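To make the gating mechanism concrete, the following is a minimal sketch of one forward step of such a memory cell, in the spirit of the original formulation in [7] (Python/NumPy; the function and weight names, and the single scalar cell, are our simplification, not the paper's code):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, s_prev, W_in, W_g, W_out):
    y_in = sigmoid(W_in @ x)    # input gate: how much new input enters the cell
    y_out = sigmoid(W_out @ x)  # output gate: how much of the state is exposed
    g = np.tanh(W_g @ x)        # squashed cell input
    s = s_prev + y_in * g       # CEC: identity self-connection, so the stored
                                # information (and the error signal) does not decay
    y = y_out * np.tanh(s)      # gated cell output
    return y, s

# toy usage on a short sequence of 10-D feature vectors
rng = np.random.default_rng(0)
W_in, W_g, W_out = rng.standard_normal((3, 10))
s = 0.0
for t in range(5):
    y, s = lstm_cell_step(rng.standard_normal(10), s, W_in, W_g, W_out)

Because the cell state is carried forward through a fixed unit-weight self-connection, the error backpropagated through the CEC neither vanishes nor explodes; this is precisely the property the gating units are built around.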

LSTM networks have been used in many applications, such as speech recognition, function approximation and music composition, among others. For a detailed explanation of the LSTM forward and backward passes, see reference [6] and the work of Hochreiter & Schmidhuber [7].

3. Principal Component Analysis

Principal Component Analysis is the technique that implements the Karhunen-Loève Transform, or Hotelling Transform, a classical unsupervised second-order method that uses the eigenvalues and eigenvectors of the covariance matrix to transform the feature space, creating orthogonal, uncorrelated features. It is a second-order method because all the necessary information is available directly from the covariance matrix of the data, and no information regarding probability distributions is needed. In the multivariate Gaussian case, the transformed feature space corresponds to the space generated by the principal axes of the hyper-ellipsoid that defines the distribution. Figure 3, obtained from [8], shows a 2-D example. The principal components are then the projections of the data onto the two main axes, \phi_1 and \phi_2. Besides, the variances of the components, given by the eigenvalues \lambda_i, are distinct in most applications, with a considerable number of them so small that they can be excluded. The selected principal components define the vector y. The objective is to find the new basis vectors by optimizing certain mathematical criteria.

Figure 3: Graphical illustration of the Karhunen-Loève Transform for the 2-D Gaussian case [8].

Mathematically, we can express the rotation of the coordinate system defined by the Karhunen-Loève Transform by an orthonormal matrix Z = [M, S] with dimensions N \times N, where M = [w_1, w_2, \ldots, w_M], of dimensions N \times M, represents the axes of the new system, and S = [w_{M+1}, w_{M+2}, \ldots, w_N], of dimensions N \times (N - M), denotes the axes of the components eliminated during the dimensionality reduction. The orthonormality conditions imply that w_j^T w_k = 0 for j \neq k, and w_j^T w_k = 1 for j = k. It is then possible to write the N-dimensional vector x in the new basis as:

    x = \sum_{j=1}^{N} (x^T w_j) w_j = \sum_{j=1}^{N} c_j w_j    (1)

where c_j is the inner product between x and w_j. The new M-dimensional vector y is then obtained by the following transformation:

    y = M^T x = [w_1, w_2, \ldots, w_M]^T x = [c_1, c_2, \ldots, c_M]^T    (2)

Thus, PCA seeks a linear transformation that maximizes the variance of the projected data.

In mathematical terms, PCA optimizes the following criterion, where C_X is the covariance matrix of the observations:

    J^{PCA}(w) = E[\|y\|^2] = E[y^T y] = \sum_{j=1}^{M} E[c_j^2]    (3)

However, it is known that c_j = x^T w_j, and therefore:

    J^{PCA}(w) = \sum_{j=1}^{M} E[w_j^T x x^T w_j] = \sum_{j=1}^{M} w_j^T E[x x^T] w_j = \sum_{j=1}^{M} w_j^T C_X w_j    (4)

subject to \|w_j\| = 1, which defines a constrained optimization problem. The solution to this problem can be achieved using Lagrange multipliers. In this case, we have:

    J^{PCA}(w_j, \gamma_j) = w_j^T C_X w_j - \gamma_j (w_j^T w_j - 1)    (5)

Differentiating the above expression with respect to w_j and setting the result to zero leads to the following result [9]:

    C_X w_j = \gamma_j w_j    (6)

Therefore, we have an eigenvector problem: the vectors w_j of the new basis that maximize the variance of the transformed data are the eigenvectors of the covariance matrix C_X, and each multiplier \gamma_j is one of its eigenvalues \lambda_j.

Another characteristic of PCA is that it minimizes the mean square error (MSE) introduced by the dimensionality reduction. In this sense, PCA tries to obtain a set of M basis vectors (M < N) that span an M-dimensional subspace in which the mean square error between the new representation and the original one is minimum. The projection of x onto the subspace spanned by the vectors w_j, j = 1, \ldots, M, is given by the truncated expansion of equation (1), and thus the MSE criterion can be defined as:

    J_{MSE}^{PCA}(w) = E[\|x - \sum_{j=1}^{M} (x^T w_j) w_j\|^2]    (7)

Considering that the data are centralized (the mean vector is null) and using the orthonormality of the basis, equation (7) is further simplified to:

    J_{MSE}^{PCA}(w) = E[\|x\|^2] - \sum_{j=1}^{M} E[(x^T w_j)^2] = E[\|x\|^2] - \sum_{j=1}^{M} w_j^T C_X w_j    (8)

As the first term does not depend on w_j, in order to minimize the MSE we have to maximize \sum_j w_j^T C_X w_j, which is exactly the optimization problem solved above with Lagrange multipliers. Substituting equation (6) into (8) leads to:

    J_{MSE}^{PCA}(w) = E[\|x\|^2] - \sum_{j=1}^{M} \gamma_j    (9)

This result shows that, in order to minimize the MSE, we must choose the eigenvectors associated with the largest eigenvalues of the covariance matrix. Finally, the PCA criteria are very effective in terms of data compression and are often used for data classification.

4. Methodology

We trained LSTM and MLP networks to perform the following face classification tasks: face or non-face classification, face authentication and gender classification. To test and evaluate the performance of the networks on these tasks, we used images from the MIT-CBCL (Center for Biological and Computational Learning) face recognition database #1, available at [10]. The CBCL FACE DATABASE #1 consists of a training set of 2,429 face images and a test set of 472 face images, with spatial dimensions of 19 x 19 pixels. Each 19 x 19 image was transformed into a 1-D signal of 361 elements. We call this representation the face descriptor. Figure 4 shows some template faces from the training set. Figure 5 shows the face descriptors corresponding to two images of the template set; each of them represents a 361-D input vector x.

Figure 4: MIT-CBCL DATABASE #1 example faces.
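To ground this feature-extraction step, the following is a minimal sketch of PCA applied to the face descriptors (Python/NumPy with hypothetical variable names; the experiments themselves were implemented in MATLAB, so this is an illustration under our own assumptions, not the paper's code):

import numpy as np

def pca_features(X, m):
    """Project face descriptors onto the m leading principal components.
    X is an (n_samples, 361) matrix whose rows are face descriptors."""
    mu = X.mean(axis=0)
    Xc = X - mu                          # centralize the data (null mean vector)
    C = np.cov(Xc, rowvar=False)         # 361 x 361 covariance matrix C_X
    lam, W = np.linalg.eigh(C)           # eigenpairs, eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:m]    # indices of the m largest eigenvalues
    Wm = W[:, order]                     # new basis [w_1, ..., w_m], as in eq. (6)
    return Xc @ Wm, Wm, mu               # y = M^T x for every sample, as in eq. (2)

# e.g. with 5, 10 or 20 components, as in the experiments reported below:
# Y_train, Wm, mu = pca_features(X_train, 10)
# Y_test = (X_test - mu) @ Wm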

Figure 5: Examples of face descriptors for template patterns: a) Template 3; b) Template 4.

The experiments were executed using MATLAB. We applied PCA to reduce the dimensionality of the input patterns (361-D), so as to avoid the problems caused by high-dimensional data. We trained the LSTM and MLP neural networks using 5, 10 and 20 principal components in order to compare the training time, the mean square error obtained by the networks in the training phase and the performance in the application phase. The number of units in each layer of a network depends on the number of principal components retained by PCA. The experiments were executed on a computer with the following specifications: Intel Core 2 Duo processor, 1.66 GHz, 667 MHz FSB, 2 MB L2 cache, 1 GB DDR2 RAM.

4.1. LSTM and MLP architectures

The LSTM network model used in the experiments is illustrated in Figure 6 (only a limited subset of connections is shown). We observed that the network performs better if there are direct connections from the input neurons to the output ones (connections without weights), and if the memory cells are self-connected and their outputs also feed memory cells in the same memory block and in other memory blocks. We used the weight initialization proposed by Corrêa, Levada and Saito [11], where the behavior of the hidden units of an LSTM network applied to function approximation is described in detail; based on that study, the authors propose a method to initialize part of the network weights in order to improve and stabilize the training process.

Figure 6: LSTM network architecture [5].

The MLP network model used in the experiments is illustrated in Figure 7. It has one input layer, one hidden layer and one output layer, and it was trained with the standard back-propagation algorithm.

Figure 7: MLP network architecture [4].

4.2. Experiments and Results

For face or non-face classification, we selected 100 faces (50 images representing face templates and 50 images representing non-face templates) from the training set. The networks are trained to output 1 when they receive a face and 0 otherwise. We stopped the LSTM network training when the MSE was smaller than 10^-2; then we trained the MLP network for the same number of epochs needed by the LSTM network, as sketched below. Clearly, the MLP network reached a much larger MSE and spent more time in the training phase (see Tables 1 and 2).
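The epoch-matching protocol can be summarized by the following sketch (Python; the object interface, the helper names and the exact threshold handling are our assumptions, not the paper's MATLAB code):

def train_with_matched_epochs(lstm, mlp, X, t, target_mse=1e-2, max_epochs=1_000_000):
    """Train the LSTM until its training MSE drops below target_mse, then
    give the MLP exactly the same epoch budget. Both models are assumed to
    expose train_epoch(X, t) and mse(X, t); these are hypothetical hooks."""
    epochs = 0
    while lstm.mse(X, t) >= target_mse and epochs < max_epochs:
        lstm.train_epoch(X, t)   # one pass over the 100 training patterns
        epochs += 1
    for _ in range(epochs):      # identical epoch budget for the MLP baseline
        mlp.train_epoch(X, t)
    return epochs                # reported together with MSE and training time

With this protocol, any difference in final MSE or classification rate reflects the architecture itself rather than the amount of training.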

Later, we chose another 100 images (50 from each class, face and non-face) from the test set. As the MLP network could not learn properly, it obtained higher incorrect classification rates. The obtained results are shown in Tables 1 and 2, where CC stands for correct classifications, IC for incorrect classifications and CCR for the correct classification rate.

Table 1: Face and non-face: LSTM training and classification. Rows: 5, 10 and 20 principal components; columns: Epochs, MSE, Time (s), CC, IC, CCR.

Table 2: Face and non-face: MLP training and classification. Rows: 5, 10 and 20 principal components; columns: Epochs, MSE, Time (s), CC, IC, CCR.

For gender classification, we selected 32 images (16 men's faces and 16 women's faces) for the training phase. The networks are trained to output 1 when they receive a man's face and 0 otherwise. Again, we trained both networks with the same number of epochs to compare their performance. In the application phase, 32 different faces of the same individuals (with different positions, expressions or illumination) are presented to the networks to be classified. The results obtained with the LSTM are shown in Table 3. Although the MLP network obtained a correct classification rate of 50%, we noted that in all experiments every face was assigned to the same class; that is, it classified all faces as men or all faces as women, depending on the situation, as can be observed in Table 4.

Table 3: Gender: LSTM training and classification. Rows: 5, 10 and 20 principal components; columns: Epochs, MSE, Time (s), CC, IC, CCR.

Table 4: Gender: MLP training and classification. Rows: 5, 10 and 20 principal components; columns: Epochs, MSE, Time (s), CC, IC, CCR.

For the authentication problem, we selected 50 faces of different individuals to represent the classes to be classified by the LSTM and MLP networks. Later, we chose another 50 faces of the same persons (with different positions, expressions or illumination) from the test set. We verified that the LSTM can learn the classes properly even with a single sample per class and a reduced feature set, as presented in Table 5.

Table 5: Authentication: LSTM and MLP classification with 10 and 20 principal components; columns: CC, IC, CCR. The LSTM correctly classified 48 of the 50 test faces (96% CCR), whereas the MLP misclassified 48 of them (4% CCR).

5. Conclusions

In this work, we proposed to use an LSTM network for face classification problems and compared its performance with that of a standard MLP network. The LSTM network presented better performance in terms of training time, mean square error and correct classification rate in all three proposed face classification tasks, showing that it is a powerful tool for pattern recognition applications, even when dealing with a reduced training set.

6. Acknowledgements

We would like to thank FAPESP for the financial support through Alexandre L. M. Levada's student scholarship (process nº 06/07-4), CNPq for the financial support to Denis H. P. Salvadeo, and CAPES for Débora C. Corrêa's student scholarship.

7. References

[1] A. K. Jain and S. Z. Li, Handbook of Face Recognition, Springer-Verlag New York, Inc., 2005.

[2] K. Fukushima, "A Neural Network for Visual Pattern Recognition", Computer, vol. 21, no. 3, pp. 65-75, 1988.

[3] C. O. Santana and J. H. Saito, "Reconhecimento Facial utilizando a Rede Neural Neocognitron", in Proceedings of the III Workshop de Visão Computacional, 2007 (in Portuguese).

[4] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd edition, Prentice Hall, 1998.

[5] A. L. M. Levada, D. C. Corrêa, D. H. P. Salvadeo, J. H. Saito and N. D. A. Mascarenhas, "Novel Approaches for Face Recognition: Template-Matching using Dynamic Time Warping and LSTM Neural Network Supervised Classification", in Proceedings of the 15th International Conference on Systems, Signals and Image Processing, Bratislava, 2008.

[6] F. Gers, Long Short-Term Memory in Recurrent Neural Networks, PhD thesis, 2001.

[7] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory", Neural Computation, 9(8):1735-1780, 1997.

[8] K. Fukunaga, An Introduction to Statistical Pattern Recognition, 2nd edition, Academic Press, 1990.

[9] T. Y. Young and T. W. Calvert, Classification, Estimation, and Pattern Recognition, Elsevier, 1974.

[10] CBCL Face Database #1, MIT Center for Biological and Computational Learning.

[11] D. C. Corrêa, A. L. M. Levada and J. H. Saito, "Stabilizing and Improving the Learning Speed of 2-Layered LSTM Network", in Proceedings of the 2008 IEEE 11th International Conference on Computational Science and Engineering, IEEE Computer Society, 2008.
