Pattern Recognition and Machine Learning: Artificial Neural Networks


Pattern Recognition and Machine Learning
James L. Crowley
ENSIMAG 3 - MMIS, Fall Semester 2016/2017
Lesson 9, 11 January 2017

Outline: Artificial Neural Networks
- Notation
- Convolutional Neural Networks
- Fully Connected Networks
- What Window Size?
- Convolutional Neural Network for an Image
- Pooling
- AutoEncoders
- The Sparsity Parameter
- Kullback-Leibler Divergence
- Examples of the Hidden Units

Using notation and figures from the Stanford Deep Learning tutorial at:

Notation

$x_d$ : A feature. An observed or measured value.
$\vec{X}$ : A vector of D features.
$D$ : The number of dimensions for the vector $\vec{X}$.
$\{\vec{X}_m\}$, $\{y_m\}$ : Training samples for learning.
$M$ : The number of training samples.
$N$ : Spatial extent (size) of the convolutional unit (N x N).
$a_{i,j}^k$ : A feature map of k features at each image position P(i,j).
$D^{(l)}$ : The number of activation units in layer l.
$\hat{\rho}_j = \frac{1}{M}\sum_{m=1}^{M} a_{j,m}^{(2)}$ : Sparsity: the average activation of hidden unit j over the training set.
$\sum_{j=1}^{D^{(2)}} KL(\rho \,\|\, \hat{\rho}_j) = \sum_{j=1}^{D^{(2)}} \left( \rho \log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j} \right)$ : Kullback-Leibler divergence.

Convolutional Neural Networks.

Convolutional Neural Networks take inspiration from the Receptive Field model of biological vision systems proposed by Hubel and Wiesel in 1968 to explain the organization of the visual cortex. By probing the visual cortex of cats with electrodes, Hubel and Wiesel discovered that the visual cortex is composed of retinal maps of visual features. Each retinal map is an image of the pattern projected on the retina, expressed as the correlation of a visual feature at a particular size (scale) and orientation. The collection of retinal feature maps serves as an intermediate representation for recognition.

Fully Connected Networks.

A fully connected network is a network in which each unit at level l+1 receives activations from all units at level l. If there are $D^{(l)}$ units at level l and $D^{(l+1)}$ units at level l+1, then a fully connected network requires learning $D^{(l)} \cdot D^{(l+1)}$ parameters. While this may be tractable for small examples, it quickly becomes excessive for practical problems, such as those found in computer vision or speech recognition. For example, a typical image may have $1024 \times 1024 = 2^{20}$ pixels. If we assume, say, $512 \times 512 = 2^{18}$ hidden units, we have $2^{38}$ parameters to learn for a single class of image pattern. Clearly this is not practical (and, in any case, unnecessary).

A common solution is to perform learning using a limited-size window, and to use all possible windows as training data. We have already seen this principle in the sliding-window approach of the Viola-Jones detector. This leads to a technique where we fix a window size of N x N input units and use all possible, overlapping, windows of size N x N from our training data to train the network. We then use the same learned weights with every hidden cell. The resulting operation is equivalent to a convolution of the learned weights with the input signal, as the sketch below illustrates.

This approach is reasonable for image and speech signals because both images and speech signals have two interesting theoretical properties: they are local and stationary.
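To make the equivalence concrete, here is a minimal sketch (not from the original notes; it assumes NumPy and SciPy are available) showing that applying one shared N x N weight window at every image position produces the same result as a correlation, i.e. a convolution with a flipped kernel:

```python
# Shared-weight sliding window == correlation (convolution with flipped kernel).
import numpy as np
from scipy.signal import correlate2d

def sliding_window_response(image, w, b):
    """Apply shared weights w (N x N) and bias b to every N x N window."""
    N = w.shape[0]
    I, J = image.shape
    out = np.zeros((I - N + 1, J - N + 1))
    for i in range(I - N + 1):
        for j in range(J - N + 1):
            out[i, j] = np.sum(w * image[i:i + N, j:j + N]) + b
    return out

image = np.random.rand(16, 16)
w, b = np.random.rand(5, 5), 0.1
r1 = sliding_window_response(image, w, b)
r2 = correlate2d(image, w, mode='valid') + b   # same result, one library call
assert np.allclose(r1, r2)
```

Because the weights are shared across positions, only the N x N kernel (plus one bias) must be learned, rather than one weight per (input unit, hidden unit) pair.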

1. Local. Local means that (most of) the required information can be found within a limited-size neighborhood of the signal. In fact, image information tends to be multi-scale, but this can easily be accommodated using multi-scale signal techniques such as a scale-invariant pyramid. Such a representation is local at multiple scales, with low-resolution scales providing context for higher-resolution scales. This can be referred to as multi-local.

2. Stationary. A stationary signal is a random (unknown) signal whose joint probability density function does not change when shifted in time (speech) or space (images). Image and speech signals tend to have stationary statistics, so the same processing can be applied to every possible (overlapping) window.

There are exceptions to both rules, but these can be handled with established techniques.

What Window Size?

What window size should be used for a feature in a Convolutional Neural Network? This tends to depend on the problem. It is not uncommon to see tutorials propose 5 x 5 image windows. This may be fine for illustration, but it is likely to be far from optimal. The impressive results in category learning were obtained with a 2D image window of 11 x 11. A minimum reasonable size is N=7. For smaller windows, the Fourier transform of the window function (the digital sinc function) dominates the spectrum and hides information. N=9 is a better choice, as the window effects are negligible.

However, there is the problem of scale invariance. In computer vision, patterns can occur at many scales. The solution is to use a scale-invariant pyramid, providing an invariant representation at a sampled set of scales. The window size should accommodate all scale changes between pyramid steps. This is generally between N=11 and N=15. The actual choice depends on the available training data, the computing time for learning, and the scale steps used in the pyramid. For today, consider that N=11 is a reasonable size.
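As an illustration of the pyramid idea, the following minimal sketch (an assumption: NumPy and SciPy are available; this is not the lecture's code) blurs and subsamples an image by a factor of 2 per level, so that a fixed N x N window covers progressively larger regions of the original image:

```python
# Scale pyramid: blur (anti-alias), then subsample by 2 at each level.
import numpy as np
from scipy.ndimage import gaussian_filter

def pyramid(image, levels=4, sigma=1.0):
    """Return a list of images, each half the resolution of the previous."""
    out = [image]
    for _ in range(levels - 1):
        blurred = gaussian_filter(out[-1], sigma)  # smooth before subsampling
        out.append(blurred[::2, ::2])              # keep every second row/column
    return out

img = np.random.rand(64, 64)
for level, p in enumerate(pyramid(img)):
    print(level, p.shape)   # (64, 64), (32, 32), (16, 16), (8, 8)
```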

Convolutional Neural Network for an Image.

A Convolutional Neural Network (CNN) can be used as a feature detector for image analysis. When used with images, a CNN provides K features at each pixel, computed by convolution with K hidden units (or kernels). Each feature is computed as a weighted sum of the pixels within an N x N window R(i,j).

Let us assume an image of J rows and I columns, where each pixel has 3 color values. Each pixel (i,j) is a color vector, $\vec{P}(i,j)$, represented by 3 integers between 0 and 255 representing Red, Green and Blue:

$$\vec{P}(i,j) = \begin{pmatrix} R \\ G \\ B \end{pmatrix}$$

In the literature on CNNs, the colors are referred to as channels, and the number of channels is called the Depth, D. We will refer to the position of a receptive field as its center; as a result, we prefer odd values for N.

The CNN will compute K filters (or kernels) for each window $R_{ij}(u,v)$ of size N x N x D that fits within the image. Indexing each window by its upper-left corner, for each position from $i = N/2,\, j = N/2$ to $i = I - N/2,\, j = J - N/2$:

$$R_{ij}(u,v) = \vec{P}(i+u-1,\; j+v-1)$$

The features can be learned as hidden units using back-propagation, with a training set where each window is labeled with a target class. For example, with the FDDB training set, each possible N x N window $R_{ij}(u,v)$ would be a training vector $\vec{X}$ of size $D^{(1)} = D \cdot N^2$. The target value y would be 1 if the center pixel of the window is in the face ellipse and 0 otherwise.
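A minimal sketch of this window-extraction step in Python with NumPy. The labeling function in_face_ellipse is hypothetical (standing in for the FDDB ellipse annotations) and is not defined here:

```python
# Extract all N x N x 3 training windows R_ij from a color image.
import numpy as np

def extract_windows(P, N, in_face_ellipse):
    """P: image of shape (J, I, 3); yields (window vector, label) pairs."""
    J, I, _ = P.shape
    h = N // 2                                        # half-width; N is odd
    for i in range(h, I - h):
        for j in range(h, J - h):
            R = P[j - h:j + h + 1, i - h:i + h + 1, :]  # N x N x 3 receptive field
            X = R.reshape(-1)                           # D^(1) = 3 * N^2 features
            y = 1 if in_face_ellipse(i, j) else 0       # label from the center pixel
            yield X, y
```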

The result is a feature map of K features at each position (i,j):

$$a_{i,j,k}^{(2)} = f\left(\sum_{u,v} w_k^{(1)}(u,v)\, R_{ij}(u,v) + b_k^{(1)}\right)$$

Note that written as a convolution, the formula would be:

$$a_k(i,j) = f\left(\sum_{u,v} w_k(u,v)\, R(i-u,\, j-v) + b_k\right)$$

Hyperparameters: CNNs are typically configured with a number of hyperparameters:

Depth: The number D of channels for each image pixel.
Stride: The step size, S, between window positions. By default it may be 1, but for larger windows it is possible to define larger step sizes.
Spatial Extent: The size, N x N, of the receptive field.
Zero-Padding: The size of the region at the border of the feature map that is filled with zeros in order to preserve the image size (typically N/2).

Pooling

Pooling is a form of non-linear down-sampling that partitions the image into non-overlapping regions and computes a representative value for each region. Pooling is typically performed over contiguous regions of the image; in this case, the stride equals the pooling window size. The CNN feature image is partitioned into small non-overlapping rectangular regions, typically of size 2x2 or 4x4. Several non-linear functions can be used, including max, average, median, and histograms. Max pooling seems to be the most popular. For example, the SIFT operator in computer vision uses local histograms over a 4x4 window. A sketch of the feature-map computation and max pooling is given below.
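The following minimal Python/NumPy sketch implements the feature-map equation above with a logistic activation, together with non-overlapping max pooling. It is written for clarity, not speed; the loops would normally be replaced by optimized convolution routines:

```python
# Feature maps a[j, i, k] = f( sum_{u,v,c} w[k,u,v,c] * R_ij[u,v,c] + b[k] ),
# followed by non-overlapping max pooling.
import numpy as np

def f(z):
    return 1.0 / (1.0 + np.exp(-z))            # logistic activation

def feature_maps(P, w, b):
    """P: (J, I, D) image; w: (K, N, N, D) kernels; b: (K,) biases."""
    K, N, _, D = w.shape
    J, I, _ = P.shape
    a = np.zeros((J - N + 1, I - N + 1, K))
    for k in range(K):
        for j in range(J - N + 1):
            for i in range(I - N + 1):
                R = P[j:j + N, i:i + N, :]     # one N x N x D receptive field
                a[j, i, k] = f(np.sum(w[k] * R) + b[k])
    return a

def max_pool(a, size=2):
    """Non-overlapping max pooling: the stride equals the pooling window size."""
    J, I, K = a.shape
    J2, I2 = J // size, I // size
    a = a[:J2 * size, :I2 * size, :]           # crop to a multiple of the pool size
    return a.reshape(J2, size, I2, size, K).max(axis=(1, 3))
```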

AutoEncoders

An autoencoder is an unsupervised learning algorithm that uses back-propagation to learn a sparse set of features for describing the training data. Rather than trying to learn a target variable, y, the auto-encoder tries to learn to reconstruct the input $\vec{X}$ using a minimum set of features. The effect is similar to the use of Principal Components Analysis on neighborhoods (as we saw in lecture 4). However, the auto-encoder provides a more appropriate basis set for recognition, while Principal Components Analysis is more appropriate for reconstruction.

Using the notation from our 2-layer network, given an input feature vector $\vec{X}$, the auto-encoder learns weights $\{w_{ji}^{(1)}, b_j^{(1)}\}$ and $\{w_{kj}^{(2)}, b_k^{(2)}\}$ such that $\vec{a} = \hat{X} \approx \vec{X}$ using as few weights as possible. Note that $D = D^{(1)}$. When the number of hidden units $D^{(2)}$ is less than the number of input units $D^{(1)}$, $\vec{a} = \hat{X} \approx \vec{X}$ is necessarily an approximation.

The error for back-propagation is computed for each component $x_{i,m}$ of the training sample $\vec{X}_m$:

$$\delta_{k,m} = a_{k,m} - a_{i,m}^{(1)} = a_{k,m} - x_{i,m}$$
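A minimal sketch of the two-layer auto-encoder forward pass, assuming a logistic activation f and NumPy; the dimensions and initialization are illustrative only:

```python
# Encode X into D2 < D1 hidden activations, then reconstruct it.
import numpy as np

def f(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_forward(X, W1, b1, W2, b2):
    """X: (D1,); W1: (D2, D1); W2: (D1, D2). Returns (hidden a2, output a3)."""
    a2 = f(W1 @ X + b1)        # level 2: hidden code
    a3 = f(W2 @ a2 + b2)       # level 3: reconstruction X_hat
    return a2, a3

D1, D2 = 64, 25                # fewer hidden units than inputs
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.standard_normal((D2, D1)), np.zeros(D2)
W2, b2 = 0.1 * rng.standard_normal((D1, D2)), np.zeros(D1)
X = rng.random(D1)
a2, a3 = autoencoder_forward(X, W1, b1, W2, b2)
delta = a3 - X                 # per-unit error: delta_k = a_k - x_k
```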

The Sparsity Parameter

The auto-encoder learns weights subject to a sparseness constraint specified by a sparsity parameter $\rho$, typically set close to zero. The sparsity parameter $\rho$ is the target average activation for the hidden units. Using the notation from last week's lecture, the auto-encoder is described by:

Level 1: $\vec{X}_m = \begin{pmatrix} x_{1,m} \\ \vdots \\ x_{D^{(1)},m} \end{pmatrix}$

Level 2: $a_{j,m}^{(2)} = f\left(\sum_{i=1}^{D^{(1)}} w_{ji}^{(1)} x_{i,m} + b_j^{(1)}\right)$

Level 3: $a_{k,m} = f\left(\sum_{j=1}^{D^{(2)}} w_{kj}^{(2)} a_{j,m}^{(2)} + b_k^{(2)}\right)$

Desired output: $\vec{a} = \begin{pmatrix} a_1 \\ \vdots \\ a_D \end{pmatrix} = \hat{X} \approx \vec{X}$, with error $\delta_{k,m} = a_{k,m} - a_{i,m}^{(1)} = a_{k,m} - x_{i,m}$.

The average activation $\hat{\rho}_j$ is computed for each of the $D^{(2)}$ hidden units over the M training samples:

$$\hat{\rho}_j = \frac{1}{M}\sum_{m=1}^{M} a_{j,m}^{(2)}$$

The auto-encoder can be learned by back-propagation using a minor change to the cost function:

$$L_{sparse}(W, B; \vec{X}_m, y_m) = \frac{1}{2}\left\|\vec{a} - \vec{X}_m\right\|^2 + \beta \sum_{j=1}^{D^{(2)}} KL(\rho \,\|\, \hat{\rho}_j)$$

where $KL(\rho \,\|\, \hat{\rho}_j)$ is the Kullback-Leibler divergence and $\beta$ controls the weight of the sparsity term. (Don't panic; this is easy to do, as the sketch below shows.)
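A minimal sketch of this cost, assuming the activations have already been computed for all M training samples; the values of rho and beta are illustrative:

```python
# Sparse auto-encoder cost: reconstruction error + beta-weighted KL penalty.
import numpy as np

def kl(rho, rho_hat):
    """Elementwise KL(rho || rho_hat) for Bernoulli-style sparsity targets."""
    return rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))

def sparse_cost(X, A2, A3, rho=0.05, beta=3.0):
    """X, A3: (M, D1) inputs and reconstructions; A2: (M, D2) hidden activations."""
    recon = 0.5 * np.sum((A3 - X) ** 2) / X.shape[0]   # (1/2)||a - X||^2, averaged
    rho_hat = A2.mean(axis=0)                          # rho_hat_j over the training set
    return recon + beta * np.sum(kl(rho, rho_hat))
```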

Kullback-Leibler Divergence

The KL divergence between the desired and average activation is:

$$\sum_{j=1}^{D^{(2)}} KL(\rho \,\|\, \hat{\rho}_j) = \sum_{j=1}^{D^{(2)}} \left( \rho \log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j} \right)$$

To incorporate the KL divergence into back-propagation, we replace

$$\delta_j^{(2)} = \frac{\partial f(z_j^{(2)})}{\partial z_j^{(2)}} \sum_{k=1}^{D^{(1)}} w_{kj}^{(2)} \delta_k$$

with

$$\delta_j^{(2)} = \frac{\partial f(z_j^{(2)})}{\partial z_j^{(2)}} \left( \sum_{k=1}^{D^{(1)}} w_{kj}^{(2)} \delta_k + \beta\left(-\frac{\rho}{\hat{\rho}_j} + \frac{1-\rho}{1-\hat{\rho}_j}\right) \right)$$

Note that you need the average activation $\hat{\rho}_j$ to compute the correction. Thus you must compute a forward pass over all the training data before computing the back-propagation for any of the training samples. This can be a problem if the number of training samples is large.

The sparsity constraint forces the hidden units to become approximately orthogonal. Thus the hidden units act as a form of basis space for the input vectors.
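A minimal sketch of the modified hidden-layer delta, assuming a logistic activation (so $f'(z) = a(1-a)$) and that rho_hat has been obtained from a full forward pass over the training data, as described above:

```python
# Hidden-layer delta with the sparsity correction added.
import numpy as np

def hidden_delta_sparse(a2, W2, delta3, rho, rho_hat, beta):
    """a2, rho_hat: (D2,); W2: (D1, D2); delta3: (D1,) output errors."""
    sparsity_term = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    return (W2.T @ delta3 + sparsity_term) * a2 * (1 - a2)  # f'(z2) = a2 * (1 - a2)
```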

Examples of the Hidden Units Given by Autoencoders

[Figure: an example of 100 hidden units learned by a sparse auto-encoder from images.]

[Figure: when applied at multiple levels and trained on face images, this can give recognizable features.]

[Figure: when trained on YouTube videos, recognizable features such as cats emerge; similarly when trained with car images.]
