
Neural Networks. Nethra Sambamoorthi, Ph.D. Jan 2003. CRMportals Inc. Phone: 732-972-8969. Nethra@crmportals.com

What? Saying It Again in Different Ways. Artificial neural network (ANN) models are a collection of computational modeling methods for predicting a yes/no response (or multinomial classes) or a continuous response (or multiple dependent responses), in the manner of the decision making of a collection of neurons in the brain. We will accept it as a way of modeling and learn from it to further the process of modeling activities.
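To make the yes/no versus continuous distinction concrete, here is a minimal sketch (not from the original slides; the input values, layer sizes, and weights are arbitrary assumptions) of how one network body can serve both kinds of prediction: a sigmoid output unit yields a probability for a yes/no response, while a linear output unit yields a continuous response.

import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.4, -1.2, 0.7])   # one input record with three predictors (assumed)
W1 = rng.normal(size=(3, 4))     # input -> hidden weights (untrained, for illustration only)
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)          # hidden -> output weights
b2 = 0.1

h = sigmoid(x @ W1 + b1)         # hidden-layer activations

p_yes = sigmoid(h @ w2 + b2)     # yes/no response: a probability in (0, 1)
y_hat = h @ w2 + b2              # continuous response: an unbounded predicted value

print(f"P(yes) = {p_yes:.3f}, continuous prediction = {y_hat:.3f}")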

What Is Covered? You will get an overview of methods: single-layer ANN; two-layer ANN; basics of training the network; back-propagation; single layer (actual calculation); comments (other methods); statistical relationships and equivalent statistical jargon. The name of the reference book whose graphs have been used is missing, as this is an old presentation made available through document retrieval research. When I locate the book it will be mentioned; apologies. A surprise gift for anyone who can identify the book.

A Great Reference Source: http://zernike.uwinnipeg.ca/jargon - an article written by Warren Sarle of SAS Institute Inc. This will take you to a whole world of neural-network-related literature and its comparison to statisticians' jargon.


Back-Propagation Steps: Set the weights randomly before beginning the iteration process. Apply the first training sample point (the weight calculation uses one sample point for one weighting coefficient at a time). Calculate the output of the network (the forward pass). Calculate the error between the network output and the target value for that sample. Adjust the weight vector to minimize the error (the reverse pass). Repeat the above steps for each input vector in the training set, adjusting the weight coefficients until the error is acceptably low. These steps are sketched in code below.
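As a concrete, hedged illustration of those steps (not the actual single-layer calculation from the original slides), the sketch below trains a one-hidden-layer network of sigmoid units by per-sample back-propagation: random initial weights, a forward pass for each sample, an error against the target, and a reverse pass that adjusts the weights. The XOR data, layer sizes, learning rate, and number of epochs are all assumptions chosen to keep the example small.

import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: XOR, a small yes/no problem that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 1.0, 1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: set the weights randomly before the iterations begin.
W1 = rng.normal(scale=1.0, size=(2, 4))   # input -> hidden weights
b1 = np.zeros(4)
w2 = rng.normal(scale=1.0, size=4)        # hidden -> output weights
b2 = 0.0
lr = 1.0                                  # learning rate (an assumed value)

for epoch in range(10000):
    for x, target in zip(X, t):
        # Steps 2-3: apply one training sample and run the forward pass.
        h = sigmoid(x @ W1 + b1)          # hidden activations
        y = sigmoid(h @ w2 + b2)          # network output

        # Step 4: error between the network output and the target value.
        err = y - target

        # Step 5: reverse pass - back-propagate the error and adjust the weights
        # by gradient descent on the squared error for this one sample.
        delta_out = err * y * (1.0 - y)             # output-unit delta
        delta_hid = delta_out * w2 * h * (1.0 - h)  # hidden-unit deltas
        w2 -= lr * delta_out * h
        b2 -= lr * delta_out
        W1 -= lr * np.outer(x, delta_hid)
        b1 -= lr * delta_hid
    # Step 6: keep cycling over the training set until the error is acceptably low.

# After training, the outputs should sit near the targets [0, 1, 1, 0]
# (convergence on XOR is likely but not guaranteed from every random start).
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ w2 + b2), 2))

Updating the weights one sample at a time, as the slide describes, corresponds to the on-line (incremental) training mentioned in the jargon list that follows; accumulating the updates over a full pass of the data would be batch training.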

A Great Reference Source: http://zernike.uwinnipeg.ca/jargon - an article written by Warren Sarle of SAS Institute Inc. This will take you to a whole world of neural-network-related literature by statisticians.

Neural Network and Statistical Jargon
=====================================
Warren S. Sarle, saswss@unx.sas.com, Apr 29, 1996
URL: ftp://ftp.sas.com/pub/neural/jargon

The neural network (NN) and statistical literatures contain many of the same concepts but usually with different terminology. Sometimes the same term or acronym is used in both literatures but with different meanings. Only in very rare cases is the same term used with the same meaning, although some cross-fertilization is beginning to happen. Below is a list of such corresponding terms or definitions. Particularly loose correspondences are marked by a ~ between the two terms. A < indicates that the term on the left is roughly a subset of the term on the right, and a > indicates the reverse. Terminology in both fields is often vague, so precise equivalences are not always possible.

The list starts with some basic definitions. There is disagreement in the NN literature on how to count layers. Some people count inputs as a layer and some don't. I specify the number of hidden layers instead. This is awkward but unambiguous.

Definition -- Statistical Jargon
================================
generalizing from noisy data and assessment of the accuracy thereof -- Statistical inference
the set of all cases one wants to be able to generalize to -- Population
a function of the values in a population, such as the mean or a globally optimal synaptic weight -- Parameter
a function of the values in a sample, such as the mean or a learned synaptic weight -- Statistic

Neural Network Jargon -- Definition
===================================
Neuron, neurode, unit, node, processing element -- a simple linear or nonlinear computing element that accepts one or more inputs, computes a function thereof, and may direct the result to one or more other neurons
Neural networks -- a class of flexible nonlinear regression and discriminant models, data reduction models, and nonlinear dynamical systems consisting of an often large number of neurons interconnected in often complex ways and often organized into layers

Neural Network Jargon -- Statistical Jargon
===========================================
Statistical methods -- Linear regression and discriminant analysis, simulated annealing, random search
Architecture -- Model
Training, Learning, Adaptation -- Estimation, Model fitting, Optimization
Classification -- Discriminant analysis
Mapping, Function approximation -- Regression
Supervised learning -- Regression, Discriminant analysis
Unsupervised learning, Self-organization -- Principal components, Cluster analysis, Data reduction
Competitive learning -- Cluster analysis
Hebbian learning, Cottrell/Munro/Zipser technique -- Principal components
Training set -- Sample, Construction sample
Test set, Validation set -- Hold-out sample
Pattern, Vector, Example, Case -- Observation, Case
Reflectance pattern -- an observation normalized to sum to 1
Binary (0/1), Bivalent or Bipolar (-1/1) -- Binary, Dichotomous
Input -- Independent variables, Predictors, Regressors, Explanatory variables, Carriers
Output -- Predicted values
Forward propagation -- Prediction
Training values, Target values -- Dependent variables, Responses, Observed values
Training pair -- Observation containing both inputs and target values
Shift register, (Tapped) (time) delay (line), Input window -- Lagged variable
Errors -- Residuals
Noise -- Error term
Generalization -- Interpolation, Extrapolation, Prediction
Error bars -- Confidence interval
Prediction -- Forecasting
Adaline (ADAptive LInear NEuron) -- Linear two-group discriminant analysis (not Fisher's but generic)
(No-hidden-layer) perceptron ~ Generalized linear model (GLIM)
Activation function, Signal function, Transfer function > Inverse link function in GLIM
Softmax -- Multiple logistic function
Squashing function -- bounded function with infinite domain
Semilinear function -- differentiable nondecreasing function
Phi-machine -- Linear model
Linear 1-hidden-layer perceptron -- Maximum redundancy analysis, Principal components of instrumental variables
1-hidden-layer perceptron ~ Projection pursuit regression
Weights, Synaptic weights < (Regression) coefficients, Parameter estimates
Bias ~ Intercept
the difference between the expected value of a statistic and the corresponding true value (parameter) -- Bias
Shortcuts, Jumpers, Bypass connections, direct linear feedthrough (direct connections from input to output) ~ Main effects
Functional links -- Interaction terms or transformations
Second-order network -- Quadratic regression, Response-surface model
Higher-order network -- Polynomial regression, Linear model with interaction terms
Instar, Outstar -- iterative algorithms of doubtful convergence for approximating an arithmetic mean or centroid
Delta rule, adaline rule, Widrow-Hoff rule, LMS (Least Mean Squares) rule -- iterative algorithm of doubtful convergence for training a linear perceptron by least squares, similar to stochastic approximation
training by minimizing the median of the squared errors -- LMS (Least Median of Squares)
Generalized delta rule -- iterative algorithm of doubtful convergence for training a nonlinear perceptron by least squares, similar to stochastic approximation
Back propagation -- Computation of derivatives for a multilayer perceptron and various algorithms (such as the generalized delta rule) based thereon
Weight decay, Regularization > Shrinkage estimation, Ridge regression
Jitter -- random noise added to the inputs to smooth the estimates
Growing, Pruning, Brain damage, Self-structuring, Ontogeny -- Subset selection, Model selection, Pre-test estimation
Optimal brain surgeon -- Wald test
LMS (Least mean squares) -- OLS (Ordinary least squares) (see also "LMS rule" above)
Relative entropy, Cross entropy -- Kullback-Leibler divergence
Evidence framework -- Empirical Bayes estimation
OLS (Orthogonal least squares) -- Forward stepwise regression
Probabilistic neural network -- Kernel discriminant analysis
General regression neural network -- Kernel regression
Topologically distributed encoding < (Generalized) Additive model
Adaptive vector quantization -- iterative algorithms of doubtful convergence for K-means cluster analysis
Adaptive Resonance Theory 2a ~ Hartigan's leader algorithm
Learning vector quantization -- a form of piecewise linear discriminant analysis using a preliminary cluster analysis
Counterpropagation -- Regressogram based on k-means clusters
Encoding, Autoassociation -- Dimensionality reduction (independent and dependent variables are the same)
Heteroassociation -- Regression, Discriminant analysis (independent and dependent variables are different)
Epoch -- Iteration
Continuous training, Incremental training, On-line training, Instantaneous training -- Iteratively updating estimates one observation at a time via difference equations, as in stochastic approximation
Batch training, Off-line training -- Iteratively updating estimates after each complete pass over the data, as in most nonlinear regression algorithms
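One of the correspondences above, the no-hidden-layer perceptron ~ generalized linear model, is easy to check directly for the logistic case. The sketch below is an added illustration, not part of Sarle's list; the synthetic data, learning rate, and epoch count are assumptions. It trains a no-hidden-layer perceptron with a sigmoid output by per-sample gradient descent on the cross-entropy error, which is the logistic regression model (a GLIM with a logit link) fitted approximately by stochastic approximation rather than by the usual iteratively reweighted least squares.

import numpy as np

rng = np.random.default_rng(2)

# Assumed synthetic data: two predictors, binary response generated from a
# known logistic model so the fitted weights have something to recover.
n = 2000
X = rng.normal(size=(n, 2))
true_w, true_b = np.array([1.5, -0.8]), 0.3
p = 1.0 / (1.0 + np.exp(-(X @ true_w + true_b)))
y = rng.binomial(1, p)

# No-hidden-layer perceptron: output = sigmoid(w.x + b), i.e. the logistic
# regression function. Train by per-sample gradient descent on cross-entropy.
w = np.zeros(2)
b = 0.0
lr = 0.05

for epoch in range(50):
    for xi, yi in zip(X, y):
        out = 1.0 / (1.0 + np.exp(-(xi @ w + b)))
        grad = out - yi              # d(cross-entropy)/d(linear predictor)
        w -= lr * grad * xi
        b -= lr * grad

print("perceptron weights:", np.round(w, 2), "bias:", round(b, 2))
print("generating weights:", true_w, "intercept:", true_b)

With the same cross-entropy criterion, the weights this loop converges toward are the maximum-likelihood logistic regression estimates, which is the sense in which the perceptron row above lines up with the GLIM row.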