Neural Network Models in Statistical Learning


Stephen Talley
April 25, 2014

Abstract

Neural network models can solve problems more easily than traditional methods by emulating the human brain. We examine a basic neural network to model regression and to classify data. We conclude with an example of basic ZIP code character recognition.

1 Introduction

1.1 Definition

Neural network models were originally developed in two separate yet equally important fields: statistics and artificial intelligence [1]. However, despite the connotations that the term "neural network" carries, there is nothing highly technical or mysterious about such a model. Rather, a neural network is defined by the following:

Definition 1. A neural network is a nonlinear statistical model that emulates the human brain on a very basic level by adapting to, or learning from, a set of training patterns [1, 2].

Because a neural network requires a set of training patterns and targets to function properly, it may be characterized as a supervised system, as opposed to an unsupervised system, which infers trends in unlabeled data.

Definition 2. A supervised system is a system or algorithm that infers trends from labeled training data.

1.2 History

1.2.1 Hebb's Rule

The origins of today's neural network models can be traced back to one man's contribution. Dr. Donald Hebb, widely regarded as the father of neuropsychology, outlined an initial theory of biological neural networking in his seminal work The Organization of Behavior (1949) [3].

Theorem 1. Hebb's Rule: When a neuron of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased.

Put simply, "cells that fire together, wire together." We can also express a simplified version of Hebb's Rule mathematically:

    Δα_ij = η x_i x_j.    (1)

In Equation (1), Δα_ij is the change in connection strength between two given nodes i and j, and η is a constant learning rate such that 0 ≤ η ≤ 1. This neurological rule not only proposed an explanation for associative learning in humans, but also provided the basis for adaptive learning algorithms in computer science.

1.2.2 Early Developments

While the advent of Hebb's Rule is considered the beginning of computational neuroscience, in truth the first neural network model had been created six years earlier [4]. In 1943, Walter Pitts and Warren McCulloch created the first computational neural network using basic algorithms. Unfortunately, because of the model's simplicity, it was only capable of solving simple arithmetic and logic problems [4].

In 1958, Frank Rosenblatt developed the first successful neurocomputer: a single layer neural network, or perceptron model, so called because it has only one hidden layer between input and output, which could receive multiple inputs and create a single output from a linear combination of those inputs [2]. The single layer perceptron, shown in Figure 1, was more adaptable than other models of the time and could solve problems more quickly and reliably despite its simplicity [8].

Figure 1: Diagram of a single layer perceptron neural network.

Despite the progress of the perceptron model, it suffered from two limitations. First, the perceptron could not solve the exclusive-or problem, a logical operation that outputs true only when its inputs differ in truth value [2]. Second, as problems increased in complexity, progressively more inputs were required for classification, and the computer hardware of the time was simply too limited to handle these problems. Most further advancement in the field stagnated until technology could meet the perceptron model's computational demands [1].
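As a small illustration of Equation (1), the following Python sketch applies the simplified Hebbian update to a matrix of connection strengths. The array names and the learning rate are illustrative choices for this sketch, not values taken from the paper.

```python
import numpy as np

def hebbian_update(weights, x, eta=0.1):
    """Apply the simplified Hebbian rule: delta_w[i, j] = eta * x[i] * x[j].

    weights : (n, n) array of connection strengths between n nodes
    x       : (n,) array of node activations
    eta     : learning rate, 0 <= eta <= 1
    """
    delta = eta * np.outer(x, x)   # change in strength for every pair of nodes
    return weights + delta

# Example: three nodes, two of which fire together ("wire together")
w = np.zeros((3, 3))
x = np.array([1.0, 1.0, 0.0])
w = hebbian_update(w, x, eta=0.5)
print(w)  # connections among the co-firing nodes have strengthened
```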

1.2.3 Recent Developments

Computer capabilities did not reach the level required for more complex neural network models until the early 1980s, and interest in the field revived soon after [1]. In particular, the discovery of the back-propagation algorithm in 1986 was crucial for further developments, since it provided a practical way to minimize the error function of any neural network model [4]. Since then, researchers have found many new applications for neural network models, including mathematical finance, data mining, handwriting recognition, and (naturally) modeling biological neural systems [1].

1.3 Basics of a Neural Network Model

All neural network models, regardless of application, share some common elements, though the number and complexity of these elements can vary depending on the model used [5]. For a basic model, as shown in Figure 1, each red node represents an individual input x_i in the vector of p inputs X^T = [x_1, x_2, x_3, ..., x_p]. These inputs together form a layer of their own, simply called the input layer [2]. Each input is connected to the nodes in the second, hidden layer, and each of these connections has a value associated with it, called a weight. Each weight is initially assigned a random value between 0 and 1, depending on the context of the problem. Then, using the inputs and weights, the model determines the value of the hidden layer node Z_m by forming the linear combination

    Σ_{i=1}^{p} α_{mi} x_i = α_{m1} x_1 + α_{m2} x_2 + ... + α_{mp} x_p.

Once the value of this linear combination is found, it is inserted into a nonlinear activation function σ. Usually, this nonlinear function is the sigmoid function

    σ(x) = 1 / (1 + e^{-x}).    (2)

The sigmoid function is frequently used, particularly for regression models, because it combines nearly linear, curvilinear, and nearly constant behavior depending on the input value [5]. As Figure 2 illustrates, the sigmoid function is nearly linear for domain values −1 < x < 1. For extreme values of x, σ(x) becomes nearly constant.

Figure 2: Plot of the general sigmoid function [6].
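A minimal Python sketch of the computation just described: form the linear combination of the inputs and weights for one hidden node and pass it through the sigmoid of Equation (2). The variable names and example values are illustrative.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, Equation (2): 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + np.exp(-x))

def hidden_unit(x, alpha_m, alpha_0m=0.0):
    """Value of one hidden node: Z_m = sigma(alpha_0m + alpha_m^T x)."""
    linear_combination = alpha_0m + np.dot(alpha_m, x)
    return sigmoid(linear_combination)

# Example with p = 3 inputs and weights drawn at random between 0 and 1
rng = np.random.default_rng(0)
x = np.array([0.5, -1.2, 2.0])
alpha_m = rng.uniform(0.0, 1.0, size=3)
print(hidden_unit(x, alpha_m))
```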

1.4 Applications

Because the neural network model can generalize a linear model using a nonlinear function, and because it can learn from data, it can be used for a variety of practical applications. In particular, neural networks are best suited to four types of problems [7]:

1. function prediction or approximation,
2. complex data classification (with nonlinear classification boundaries),
3. clustering based on internal properties of the data, and
4. time-series forecasting.

1.5 Advantages and Disadvantages of a Neural Network Model

The neural network model offers a few distinct advantages over other types of machine learning algorithms. Because a neural network is a supervised system (i.e., it requires a standard or basis for classification), it requires less formal training to determine a proper algorithm for a given data set [5]. Furthermore, neural networks can detect more complex relationships and interactions among variables thanks to their aforementioned ability to derive parameters from data [8]. One last advantage of the neural network is the ubiquity of training algorithms for working with data, most likely stemming from the variety of applications.

Unfortunately, neural networks also have several disadvantages. Though computer technology has advanced substantially since the neural network's introduction, more complex models still have heavy computational demands that sometimes cannot be met within a reasonable time. Another disadvantage involves the sheer quantity of connections and weights. Since almost every node is connected to another, with a weight for each connection, overfitting the data can be an issue; however, this problem can be controlled either by early stopping or by a process called weight decay, which uses a penalty function to shrink all weights toward zero, thereby reducing the model to an approximately linear one [1].

2 Body

2.1 Advanced Neural Networks

Naturally, with more advanced computers come more advanced neural networks. Since the transformation functions of the hidden layers are fairly simple, a typical neural network model can in practice have up to 100 nodes encompassing multiple hidden layers [1]. In this case, the formula for determining the outputs becomes a multi-step transformation:

    Z_m = σ(α_{0m} + α_m^T X),  where X = [x_1, x_2, ..., x_p],
    T_k = β_{0k} + β_k^T Z,
    f_k(X) = g_k(T),  where T = [T_1, T_2, ..., T_K].    (3)
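The multi-step transformation of Equation (3) can be sketched in a few lines of Python. This is a minimal illustration with a single hidden layer; the output function g_k is left as the identity here (the regression case discussed below), and all names and dimensions are assumptions of this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, alpha0, alpha, beta0, beta, g=lambda t: t):
    """Forward pass of Equation (3).

    x      : (p,)   input vector
    alpha0 : (M,)   hidden-layer biases alpha_{0m}
    alpha  : (M, p) hidden-layer weights alpha_m
    beta0  : (K,)   output biases beta_{0k}
    beta   : (K, M) output weights beta_k
    g      : output activation g_k (identity for regression)
    """
    z = sigmoid(alpha0 + alpha @ x)   # Z_m = sigma(alpha_0m + alpha_m^T x)
    t = beta0 + beta @ z              # T_k = beta_0k + beta_k^T Z
    return g(t)                       # f_k(X) = g_k(T)

# Example: p = 4 inputs, M = 3 hidden nodes, K = 2 outputs
rng = np.random.default_rng(1)
p, M, K = 4, 3, 2
x = rng.normal(size=p)
print(forward(x, rng.normal(size=M), rng.normal(size=(M, p)),
              rng.normal(size=K), rng.normal(size=(K, M))))
```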

Typically, the complexity of these neural network models depends on the following variables: p, the number of inputs; M, the total number of hidden neurons; and K, the number of classes or outputs. Each step of this algorithm alternates the linearity of the data. Initially, the neural network forms linear combinations of the original inputs. Then, each linear combination is passed through the activation function σ. Unlike the single-layer perceptron model, a multi-layer network forms an additional linear combination T_k from the nonlinear hidden layer values Z_m and then passes this linear combination through another, different nonlinear function g_k(T). Note that g_k(T) in Equation (3) is an additional, often final, activation function introduced by the inclusion of multiple hidden layers. In some of the earliest multi-layer neural network models (and in some current regression models), g_k(T) = T_k; thus, the entire model reduced to a linear output [5]. Classification models later replaced this identity function with the softmax function

    g_k(T) = e^{T_k} / Σ_{l=1}^{K} e^{T_l}.    (4)

The softmax function (Equation (4)) was chosen for its probabilistic properties: each output is between zero and one, and all outputs sum to one [7].

2.2 Overparameterization and Prevention

2.2.1 The Weight Problem

Because the scale of the neural network model depends on both the number of neurons and the number of inputs, the quantity of connections increases as these two variables increase. These weights are designated by two key sets of parameters, α and β, the complete sets of which are given by the matrices below [1]:

    [ α_01  α_11  α_21  ...  α_p1 ]        [ β_01  β_11  β_21  ...  β_M1 ]
    [ α_02  α_12  α_22  ...  α_p2 ]        [ β_02  β_12  β_22  ...  β_M2 ]
    [ α_03  α_13  α_23  ...  α_p3 ]        [ β_03  β_13  β_23  ...  β_M3 ]
    [  ...   ...   ...  ...   ... ]        [  ...   ...   ...  ...   ... ]
    [ α_0M  α_1M  α_2M  ...  α_pM ]        [ β_0K  β_1K  β_2K  ...  β_MK ]

Even if errors are minimized, the neural network may overfit the data due to the sheer quantity of weights accounted for in the algorithm [1]. An overfitted model becomes excessively complex, and it will often exaggerate minor or random errors in the data. The best and most efficient way to prevent overfitting is to establish an early stopping rule [5]. An early stopping rule trains the model for only a short time, stopping before the weights have fully adapted to the training data. This keeps the model simpler while limiting the potential effect of random error.
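A short Python sketch of the softmax function of Equation (4). Subtracting the maximum before exponentiating is a standard numerical-stability trick added in this sketch; it is not discussed in the paper.

```python
import numpy as np

def softmax(t):
    """Softmax outputs g_k(T) = exp(T_k) / sum_l exp(T_l).

    Each output lies between zero and one and the outputs sum to one,
    so they can be read as class probabilities.
    """
    shifted = t - np.max(t)        # stability: avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

t = np.array([2.0, 1.0, 0.1])
probs = softmax(t)
print(probs, probs.sum())          # roughly [0.66 0.24 0.10], summing to 1.0
```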

2.2.2 Error Functions and Minima

Aside from the problem of having too many weights, a neural network may also have problems associated with the weights' values. Consequently, we must adjust the initially random weights so that they fit the data well enough to make predictions [1]. For regression models, we use a sum-of-squares error function

    R(θ) = Σ_{k=1}^{K} Σ_{i=1}^{N} (y_{ik} − f_k(x_i))^2.    (5)

Note that R(θ) measures the total squared difference between the actual class or value and the predicted class or value across all K classes and all N observations. For a classification neural network, we can instead use the cross-entropy

    R(θ) = − Σ_{i=1}^{N} Σ_{k=1}^{K} y_{ik} log f_k(x_i),

which can be interpreted as the amount of information needed to categorize a given observation [5].

2.2.3 Weight Decay

While the aforementioned early-stopping technique can be effective for limiting the effective number of weights, there is a more explicit method that controls the quality of the weights rather than their quantity: a process known as weight decay [1]. By adding a penalty term to the error function, the error equation becomes

    R(θ) + λ J(θ),  where  J(θ) = Σ_{k,m} β_{km}^2 + Σ_{m,l} α_{ml}^2

and λ ≥ 0 is a tuning parameter [7]. The larger the value of λ, the more strongly the weights are shrunk toward 0. As the weights shrink toward 0, the activation (sigmoid) function, and by extension the entire model, reduces to an approximately linear function. The value of λ is generally estimated using cross-validation [7]. Weight decay is especially important as it helps to improve prediction for any type of neural network [1].

2.3 Back-propagation

Regardless of the equation used for R(θ), it is an error term; therefore, we want to keep the value of R(θ) small. In neural network design, the most popular method for minimizing R(θ) is back-propagation (a form of gradient descent) [7]. Quite simply, back-propagation is the process of working backwards from an estimated point using the function's rate of change. Once we have the rate of change and the estimated point, we estimate another, lower point on the function, and repeat until we reach a minimum. While the network is trained with this algorithm, its weights are continually modified to reduce the mean squared error across all classes and observations [5]. The back-propagation method can be applied to either single-variable or multivariate functions.
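The error functions and the weight-decay penalty above translate directly into code. Below is a minimal Python sketch with illustrative array names; y is assumed to hold one-hot class indicators y_{ik} and f the corresponding network outputs f_k(x_i).

```python
import numpy as np

def sum_of_squares_error(y, f):
    """Equation (5): R(theta) = sum_k sum_i (y_ik - f_k(x_i))^2."""
    return np.sum((y - f) ** 2)

def cross_entropy_error(y, f, eps=1e-12):
    """R(theta) = -sum_i sum_k y_ik * log f_k(x_i); eps guards against log(0)."""
    return -np.sum(y * np.log(f + eps))

def weight_decay_penalty(alpha, beta, lam):
    """lambda * J(theta), with J(theta) = sum beta_km^2 + sum alpha_ml^2."""
    return lam * (np.sum(beta ** 2) + np.sum(alpha ** 2))

# Example with N = 2 observations and K = 3 classes
y = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)    # targets
f = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]])     # network outputs
print(sum_of_squares_error(y, f), cross_entropy_error(y, f))
```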

Figure 3: Examples of handwritten characters from the training data [1].

In this case there are only two sets of parameters over which we have any degree of control, α and β. Using Equation (5) as our error function, we obtain the derivatives

    ∂R_i/∂β_{km} = −2 (y_{ik} − f_k(x_i)) g_k'(β_k^T z_i) z_{mi},

    ∂R_i/∂α_{ml} = − Σ_{k=1}^{K} 2 (y_{ik} − f_k(x_i)) g_k'(β_k^T z_i) β_{km} σ'(α_m^T x_i) x_{il}.

Once these rates of change are determined, a gradient descent update for the (r+1)st iteration takes the form

    β_{km}^{(r+1)} = β_{km}^{(r)} − γ_r Σ_{i=1}^{N} ∂R_i/∂β_{km}^{(r)},

    α_{ml}^{(r+1)} = α_{ml}^{(r)} − γ_r Σ_{i=1}^{N} ∂R_i/∂α_{ml}^{(r)}.

The term γ in both equations denotes the step size for the back-propagation, an arbitrary constant such that 0 ≤ γ ≤ 1. The actual value of the step size should be chosen carefully, as problems may arise if γ is either too large or too small. If the step size is too large, the algorithm may overstep the local minimum and arrive at a larger, inaccurate result. If the step size is too small, the algorithm will take a long time to reach the local minimum, sacrificing efficiency in the process.

Due to its simplicity, back-propagation is considered the textbook approach to minimizing error; however, there are other methods that can converge to minima more quickly [1]. Newton's method can in principle be used for this optimization, but because the second derivatives with respect to both sets of parameters can be very complex, it is usually avoided. One more efficient method is a variation of traditional back-propagation called conjugate gradient back-propagation. It is similar to back-propagation, but rather than using the negative gradient as the direction of steepest descent, the algorithm uses a line search along conjugate directions as alternative directions of descent [7]. While this method tends to converge faster, it is also more computationally demanding because of the search required at each step.
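The gradient expressions above lead to the following minimal Python sketch of one gradient-descent (back-propagation) update, for the regression case with an identity output function g_k and the sum-of-squares error of Equation (5). All names, dimensions, and the step size are assumptions of this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(X, Y, alpha0, alpha, beta0, beta, gamma=0.001):
    """One batch gradient-descent update for a single-hidden-layer network.

    X : (N, p) inputs, Y : (N, K) targets
    alpha0, alpha : hidden-layer biases (M,) and weights (M, p)
    beta0, beta   : output biases (K,) and weights (K, M)
    """
    Z = sigmoid(X @ alpha.T + alpha0)        # (N, M) hidden values Z_m
    F = Z @ beta.T + beta0                   # (N, K) outputs f_k(x_i)

    delta = -2.0 * (Y - F)                   # (N, K) dR_i/dT_k (g_k' = 1)
    s = (delta @ beta) * Z * (1.0 - Z)       # (N, M) back-propagated errors

    beta   -= gamma * delta.T @ Z            # sum_i dR_i/dbeta_km
    beta0  -= gamma * delta.sum(axis=0)
    alpha  -= gamma * s.T @ X                # sum_i dR_i/dalpha_ml
    alpha0 -= gamma * s.sum(axis=0)
    return alpha0, alpha, beta0, beta

# Example: repeated updates gradually reduce the sum-of-squares error
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4)); Y = X.sum(axis=1, keepdims=True)
a0, A = np.zeros(3), rng.normal(scale=0.1, size=(3, 4))
b0, B = np.zeros(1), rng.normal(scale=0.1, size=(1, 3))
for _ in range(200):
    a0, A, b0, B = backprop_step(X, Y, a0, A, b0, B, gamma=0.001)
```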

2.4 Example: ZIP Code Character Recognition

2.4.1 The Setup

One of the earliest and best-known problems in neural networks has been handwritten character recognition. Because recognizing characters is essentially a classification task (into categories A-Z for letters and 0-9 for digits), it is an ideal test of a neural network's capabilities. The particular data set for this example is the same one used in a similar neural network test in 1989 [9]; however, this example uses more advanced neural networks and error reduction techniques than the earlier experiment. Every digit was scanned from U.S. Postal Service envelopes and then standardized into 16 x 16 pixel grayscale images such as those in Figure 3. The digits were standardized this way to limit characteristics (such as the slant or rotation of the number) that could lead to misclassification [1]. Since the digits were 16 x 16, each digit, treated as one observation, had 256 inputs.

2.4.2 The Procedure

Since this example uses the same data as the 1989 neural network test, the total data set consisted of 9298 handwritten digits, each one an individual observation. The data were divided into two main subsets: a practice set of 7291 observations and a working set of 2007 observations [9]. The practice subset was further divided into randomly assigned training, validation, and test sets to prepare the neural network. Both the inputs and the actual targets (the ground truth of which input belongs to which class) were passed to a MATLAB program which then constructed the neural network.

The MATLAB program relies on user input for only two parts of a single-layer neural network model: the number of neurons in the hidden layer and the allocation of the training data. Given the subdivisions of the training data mentioned above, we determined that an allocation of 90% training, 5% validation, and 5% testing for the 7291 observations in the practice data set yielded the best performance for a single-layer network. While using the entire practice set for training would likely have been ideal, restrictions in MATLAB's neural network scripts prevented us from doing so. The primary reason this allocation was so effective was the high number of observations reserved strictly for training, i.e., preparing the neural network. Since the network was actually being prepared for the working data set, the testing and validation portions could be kept relatively small.

2.4.3 The Results

Once we determined the best allocation to use, the next aspect to consider was the size of the neural network, or more specifically, the number of neurons in the hidden layer. For the sake of simplicity, we investigated network sizes in multiples of 10 neurons and compared their accuracy. After running each network five times, an average accuracy rating was taken, and the initial findings are given in Table 1. Ultimately, a network with 60 hidden neurons was the most accurate by a slight margin. While neural networks with more than 100 neurons might be more accurate, they would take a considerable amount of time to compute for each iteration of the network.
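The procedure described above was carried out in MATLAB; as a rough illustration only, a comparable setup can be sketched in Python with scikit-learn. The 90/5/5 split and the 60-neuron hidden layer follow the description in the text, but the library, its parameters, the helper name, and the data loading are assumptions of this sketch, not the author's actual code.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# X_practice: (7291, 256) pixel inputs, y_practice: digit labels 0-9
# (loading the ZIP code data is assumed to have been done elsewhere)
def train_zip_network(X_practice, y_practice, n_hidden=60, seed=0):
    # Hold out 5% of the practice set as a test split; the classifier below
    # reserves roughly another 5% for validation, giving a ~90/5/5 allocation.
    X_train, X_test, y_train, y_test = train_test_split(
        X_practice, y_practice, test_size=0.05, random_state=seed)

    net = MLPClassifier(hidden_layer_sizes=(n_hidden,),
                        activation="logistic",       # sigmoid hidden units
                        early_stopping=True,         # stop on validation error
                        validation_fraction=0.0526,  # ~5% of the full practice set
                        max_iter=500,
                        random_state=seed)
    net.fit(X_train, y_train)
    return net, net.score(X_test, y_test)
```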

Table 1: Data for initial neural network tests. Columns: Training %, Validation %, Test %, Number of Neurons, Accuracy %.

Once we determined the most accurate allocation and network size, the network was ready for the data. Because MATLAB uses different initial conditions for each neural network test, the best way to gather information was to run the test several times; thus, we decided to run the network one hundred times. This reduced the actual test to setting up a for loop to evaluate the working data set a full one hundred times. The program would then plot a histogram of the classification error and display both the mean and the standard deviation of that error. Returning to the issue of time, this particular loop required approximately two hours to complete all one hundred trials.

The resulting histogram, shown in Figure 4, revealed some interesting results. For one, the error distribution was strongly right-skewed, with only one true outlier at 27% error. Furthermore, both the average error and the standard deviation were far smaller than the findings in Table 1 would have suggested: the mean error was 1.58%, meaning that the network achieved 98.42% accuracy, with a correspondingly small standard deviation. Though this accuracy may seem high, it is relatively modest compared with other modern neural networks; as of 2011, multi-layer networks have reported error rates as low as 0.7% [1].

3 Conclusion

Overall, neural network models are incredibly useful and versatile statistical learning tools. This paper examines only the basics of the models themselves, error detection, and possible applications that such models can have. Though other applications, such as regression modeling or time series analysis, along with more thorough multi-layer networks, may be examined at a later date, this paper should be sufficient to give the reader a proper overview of this fascinating subject.
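The repeated-evaluation loop described above, running the network one hundred times with different initial conditions, can be sketched as follows in Python. It assumes the hypothetical train_zip_network helper from the previous sketch and a separate working set (X_working, y_working); it illustrates the procedure and is not the author's MATLAB script.

```python
import numpy as np

def repeated_trials(X_practice, y_practice, X_working, y_working, n_trials=100):
    """Train the network n_trials times with different random seeds and
    record the classification error on the working set each time."""
    errors = []
    for seed in range(n_trials):
        net, _ = train_zip_network(X_practice, y_practice, seed=seed)
        accuracy = net.score(X_working, y_working)
        errors.append(1.0 - accuracy)
    errors = np.array(errors)
    return errors.mean(), errors.std(), errors

# mean_err, std_err, errors = repeated_trials(Xp, yp, Xw, yw)
# A histogram of `errors` corresponds to Figure 4.
```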

Figure 4: Histogram of classification error on the working set.

References

[1] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer: New York, 2009.
[2] K. Gurney, An Introduction to Neural Networks. UCL Press: London, 1997.
[3] D. Hebb, The Organization of Behavior. Wiley and Sons: New York, 1949.
[4] I.A. Basheer and M. Hajmeer, "Artificial Neural Networks: Fundamentals, Computing, Design, and Application," Journal of Microbiological Methods, vol. 43, issue 1, 2000.
[5] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, Third Edition. Morgan Kaufmann: Burlington, Massachusetts, 2011.
[6] Image found at svg.
[7] S. Samarasinghe, Neural Networks for Applied Sciences and Engineering. Auerbach: Boca Raton, Florida.
[8] J.V. Tu, "Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes," Journal of Clinical Epidemiology, vol. 49, issue 11, 1996.
[9] Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, and L.D. Jackel, "Back-Propagation Applied to Handwritten ZIP Code Recognition," Neural Computation, vol. 1, 1989.
