Artificial Neural Networks


Artificial Neural Networks
Alvaro J. Riascos Villegas
University of los Andes and Quantil
July 6, 2018

Contents
1. Introduction
2. ANN in Action
3. Final Observations
4. Application: Poverty Detection

Introduction
ANNs are highly nonlinear yet tractable machine learning models, and they are universal approximators. Advances in calibration (training) methodologies in terms of speed, together with enormous success in solving pattern recognition problems such as image recognition and voice translation, have put ANNs and deep ANNs at the center stage of research and industry applications.

Basic ANN: Feedforward Neural Net
The most basic ANN is called a feedforward neural net or multilayer perceptron. The logistic model is a special case, so you already know the simplest ANN! These networks are difficult to optimize globally; a key idea for computationally efficient optimization is backpropagation.

Basic ANN: Two layers
A two-layer ANN can be represented by the following graph: there is one hidden layer and one output layer, and each layer may have an arbitrary number of units (i.e., neurons).

[Figure (Bishop PRML, Figure 5.1): network diagram for the two-layer neural network. The input, hidden, and output variables ($x_1, \dots, x_D$; $z_1, \dots, z_M$; $y_1, \dots, y_K$) are represented by nodes, the weight parameters by links between the nodes, and the bias parameters by links from the additional variables $x_0$ and $z_0$. Arrows denote the direction of information flow through the network during forward propagation.]

The choice of output activation function follows the same considerations as for linear models: for standard regression problems the output activation is the identity, so that $y_k = a_k$; for multiple binary classification problems, each output unit activation is transformed using a logistic sigmoid function.

Basic ANN: Two layers
Assume you have $D$ input variables $\{x_1, \dots, x_D\}$, $M$ hidden units, and $K$ output variables $\{y_1, \dots, y_K\}$. Let $h_1, h_2$ be activation functions ($h_2$ is the activation function of the output layer). The following equations describe a two-layer feed-forward ANN:

Basic ANN: Two layers
$$a_j^{(1)} = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}, \qquad z_j^{(1)} = h_1\big(a_j^{(1)}\big)$$
$$a_k^{(2)} = \sum_{j=1}^{M} w_{kj}^{(2)} z_j^{(1)} + w_{k0}^{(2)}, \qquad z_k^{(2)} = h_2\big(a_k^{(2)}\big)$$
where $w_{j0}^{(1)}$ and $w_{k0}^{(2)}$ represent the biases (constants) in each layer.

Basic ANN: Two layers
Let $z_i^{(0)} = x_i$ and $y_k = z_k^{(2)}$. If we define the additional variables $x_0 = 1$ and $z_0^{(1)} = 1$, we can rewrite the equations describing the ANN as:
$$y_k(\mathbf{x}, \mathbf{w}) = h_2\left( \sum_{j=0}^{M} w_{kj}^{(2)} \, h_1\left( \sum_{i=0}^{D} w_{ji}^{(1)} x_i \right) \right)$$
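To make the forward propagation concrete, here is a minimal NumPy sketch of these equations. The function name forward, the weight shapes, and the example sizes are our own illustrative choices, not from the slides:

```python
import numpy as np

def forward(x, W1, b1, W2, b2, h1=np.tanh, h2=lambda a: a):
    """Two-layer forward pass y_k(x, w) in the notation above.

    x  : (D,)   input vector
    W1 : (M, D) first-layer weights  w^(1)_ji
    b1 : (M,)   first-layer biases   w^(1)_j0
    W2 : (K, M) second-layer weights w^(2)_kj
    b2 : (K,)   second-layer biases  w^(2)_k0
    h1 : hidden activation; h2 : output activation (identity for regression).
    """
    a1 = W1 @ x + b1   # a^(1)_j = sum_i w^(1)_ji x_i + w^(1)_j0
    z1 = h1(a1)        # z^(1)_j = h1(a^(1)_j)
    a2 = W2 @ z1 + b2  # a^(2)_k = sum_j w^(2)_kj z^(1)_j + w^(2)_k0
    return h2(a2)      # y_k = h2(a^(2)_k)

# Usage with random weights: D=3 inputs, M=4 hidden units, K=2 outputs.
rng = np.random.default_rng(0)
D, M, K = 3, 4, 2
y = forward(rng.normal(size=D),
            rng.normal(size=(M, D)), rng.normal(size=M),
            rng.normal(size=(K, M)), rng.normal(size=K))
print(y.shape)  # (2,)
```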

Basic ANN: Two layers
This two-layer terminology reflects the fact that we have to estimate two sets of parameters: the linear weights at each layer.

[Figure (Bishop PRML, Figure 5.1, repeated): two-layer network diagram with inputs $x_0, x_1, \dots, x_D$, hidden units $z_0, z_1, \dots, z_M$, and outputs $y_1, \dots, y_K$.]

Basic ANN: Universal Approximation Property
An ANN with two layers and linear output activation functions can approximate any continuous function on a bounded domain (more precisely, on a compact domain) with enough hidden neurons. This holds for many choices of hidden-layer activation function (though not for polynomials).

Approximation properties examples (2 layers, 3 neurons)

[Figure (Bishop PRML, Figure 5.3): illustration of the capability of a multilayer perceptron to approximate four different functions: (a) $f(x) = x^2$, (b) $f(x) = \sin(x)$, (c) $f(x) = |x|$, and (d) $f(x) = H(x)$, where $H(x)$ is the Heaviside step function. In each case, $N = 50$ data points (blue dots) were sampled uniformly in $x$ over the interval $(-1, 1)$ and used to train a two-layer network with 3 tanh hidden units and linear output units. Red curves show the resulting network functions; the dashed curves show the outputs of the three hidden units.]

The figure shows the capability of a two-layer network to model a broad range of functions, and how the individual hidden units work collaboratively to approximate the final function.
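The following sketch reproduces the flavor of panel (b): fitting $\sin(x)$ on 50 points with a 3-hidden-unit tanh network trained by full-batch gradient descent. The learning rate, iteration count, and initialization are illustrative assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(50, 1))  # 50 inputs sampled uniformly in (-1, 1)
t = np.sin(x)                          # targets f(x) = sin(x)

M = 3  # hidden units, as in the figure
W1, b1 = rng.normal(size=(M, 1)), np.zeros(M)
W2, b2 = rng.normal(size=(1, M)), np.zeros(1)

lr = 0.1  # assumed learning rate
for _ in range(5000):
    a1 = x @ W1.T + b1                 # (50, M) pre-activations
    z1 = np.tanh(a1)                   # hidden-unit outputs
    y = z1 @ W2.T + b2                 # linear output layer
    err = y - t                        # dE/dy for squared error
    # Backpropagation: chain rule through the two layers.
    dW2 = err.T @ z1 / len(x)
    db2 = err.mean(axis=0)
    delta1 = (err @ W2) * (1 - z1**2)  # tanh'(a) = 1 - tanh(a)^2
    dW1 = delta1.T @ x / len(x)
    db1 = delta1.mean(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

y = np.tanh(x @ W1.T + b1) @ W2.T + b2
print(float(np.mean((y - t) ** 2)))    # training MSE should be small
```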

Classifying using ANN

[Figure (Bishop PRML, Figure 5.4): example solution of a simple two-class classification problem on synthetic data, using a neural network with two inputs, two tanh hidden units, and a single output with logistic sigmoid activation. The dashed blue lines show the $z = 0.5$ contours of the hidden units, the red line shows the $y = 0.5$ decision surface of the network, and the green line denotes the optimal decision boundary computed from the distributions used to generate the data.]

Weight-space symmetries: for tanh hidden units, flipping the sign of all weights and the bias feeding into a hidden unit, together with the sign of the weights leading out of it, leaves the input-output mapping unchanged (since $\tanh(-a) = -\tanh(a)$); thus any given weight vector is one of a set of $2^M$ equivalent weight vectors. Similarly, interchanging all the weights (and bias) into and out of one hidden unit with the corresponding values for a different hidden unit also leaves the mapping unchanged, giving $M!$ equivalent orderings of the $M$ hidden units. The network therefore has an overall weight-space symmetry factor of $M!\,2^M$. For networks with more than two layers of weights, the total symmetry is the product of such factors, one for each layer of hidden units.
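A quick numerical check of the sign-flip symmetry described above; the network sizes and weights are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
D, M = 2, 2
W1, b1 = rng.normal(size=(M, D)), rng.normal(size=M)
W2, b2 = rng.normal(size=(1, M)), rng.normal(size=1)

def net(x, W1, b1, W2, b2):
    # Two tanh hidden units, linear combination at the output.
    return W2 @ np.tanh(W1 @ x + b1) + b2

# Flip the sign of all weights into and out of hidden unit 0:
# tanh(-a) = -tanh(a), so the two sign changes cancel.
W1f, b1f, W2f = W1.copy(), b1.copy(), W2.copy()
W1f[0], b1f[0], W2f[:, 0] = -W1[0], -b1[0], -W2[:, 0]

x = rng.normal(size=D)
print(np.allclose(net(x, W1, b1, W2, b2),
                  net(x, W1f, b1f, W2f, b2)))  # True: same mapping
```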


ANN in Action: Data

ANN in Action: Logistic (One layer with sigmoid activation)
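The slide's point, in a minimal sketch: a one-layer "network" with a sigmoid output activation is exactly the logistic regression model. The names sigmoid and logistic_net and the example numbers are our own:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def logistic_net(x, w, b):
    """One layer (linear score + sigmoid output activation):
    the logistic model, i.e. the simplest ANN."""
    return sigmoid(w @ x + b)

# P(y = 1 | x) for a 2-dimensional input with illustrative weights.
print(logistic_net(np.array([0.5, -1.0]), np.array([2.0, 1.0]), 0.1))
```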

ANN in Action: Two layers


Final Observations
ANNs can be trained using standard techniques (gradient descent, etc.). The key issue is how to calculate the derivatives of the loss function with respect to the parameters: exploit the structure of the net and the chain rule (i.e., backpropagation). Deep ANNs are ANNs with many layers and, typically, many neurons per layer. Adding layers can allow a more compact representation of a continuous function. Learning data representations (features) is possible by extracting intermediate outputs from hidden layers.
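To spell out the chain-rule step for the two-layer network above, take the squared error $E = \frac{1}{2}\sum_k (y_k - t_k)^2$ for one training example. The following is the standard backpropagation computation written in the slides' notation; the $\delta$ symbols are auxiliary quantities introduced here for the derivation:

```latex
\delta_k^{(2)} \equiv \frac{\partial E}{\partial a_k^{(2)}}
             = h_2'\big(a_k^{(2)}\big)\,(y_k - t_k), \qquad
\delta_j^{(1)} \equiv \frac{\partial E}{\partial a_j^{(1)}}
             = h_1'\big(a_j^{(1)}\big) \sum_k w_{kj}^{(2)}\,\delta_k^{(2)},
\\[6pt]
\frac{\partial E}{\partial w_{kj}^{(2)}} = \delta_k^{(2)}\, z_j^{(1)}, \qquad
\frac{\partial E}{\partial w_{ji}^{(1)}} = \delta_j^{(1)}\, x_i .
```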


Application: Nightlight
Combining satellite imagery and machine learning to predict poverty. Jean et al., Science, 2016.

Application: Learning Representations

Application: Prediction