Numerical Learning Algorithms


 Madeleine Johns
 1 years ago
Outline

Introduction
Naive Bayes
  - Naive Bayes
  - Naive Bayes Example
  - Naive Bayes Example Continued
Linear Models
  - Linear Models
  - Example of Numeric Examples
  - Linear Regression
  - Least Squares Gradient Descent
  - Perceptron Learning Rule
  - Perceptrons Continued
  - Example of Perceptron Learning
The Nearest Neighbor Algorithm
Neural Networks
  - Artificial Neural Networks
  - ANN Structure
  - ANN Illustration
  - Illustration
  - Sigmoid Activation
  - Plot of Sigmoid Function
  - Backpropagation
  - Applying The Chain Rule
Support Vector Machines
  - Support Vector Machines
  - Example SVM for Separable Examples
  - Example SVM for Nonseparable Examples
  - Example Gaussian Kernel SVM
  - Example Gaussian Kernel, Zoomed In
Ensemble Learning
  - Ensemble Learning
  - Boosting
  - Example Boosting Algorithm
  - Example Run of AdaBoost
  - Example Run of AdaBoost, Continued
Introduction

Numerical learning methods learn the parameters or weights of a model, often by optimizing an error function. Examples include:

  - Calculate the parameters of a probability distribution.
  - Separate positive from negative examples by a decision boundary.
  - Find points close to positive but far from negative examples.
  - Update parameters to decrease error.

CS 79 Artificial Intelligence Numerical Learning Algorithms

Naive Bayes

For class $C$ and attributes $X_i$, assume:

$P(C, X_1, \ldots, X_n) = P(C)\,P(X_1 \mid C) \cdots P(X_n \mid C)$

This corresponds to a Bayesian network where $C$ is the sole parent of each $X_i$. Estimate prior and conditional probabilities by counting. If an outcome occurs $m$ times out of $n$ trials, Laplace's law of succession recommends the estimate $(m + 1)/(n + k)$, where $k$ is the number of outcomes.

Naive Bayes Example

Using Laplace's law of succession on the 14 examples:

$P(\text{pos}) = (9 + 1)/(14 + 2) = 10/16$
$P(\text{neg}) = (5 + 1)/(14 + 2) = 6/16$
$P(\text{sunny} \mid \text{pos}) = (2 + 1)/(9 + 3) = 3/12$
$P(\text{overcast} \mid \text{pos}) = (4 + 1)/(9 + 3) = 5/12$
$P(\text{rain} \mid \text{pos}) = (3 + 1)/(9 + 3) = 4/12$

Naive Bayes Example Continued

For the first example, where $\alpha$ is the normalizing constant:

$P(\text{pos} \mid \text{sunny}, \text{hot}, \text{high}, \text{false}) = \alpha\,(10/16)(3/12)(3/12)(4/11)(7/11) \approx 0.0090\,\alpha$
$P(\text{neg} \mid \text{sunny}, \text{hot}, \text{high}, \text{false}) = \alpha\,(6/16)(4/8)(3/8)(5/7)(3/7) \approx 0.0215\,\alpha$

Linear Models

For a linear model, the output and each attribute must be numeric. The input of an example is a numeric vector $\mathbf{x} = (1, x_1, \ldots, x_n)$. A hypothesis is a weight vector $\mathbf{w} = (w_0, w_1, \ldots, w_n)$, where $w_0$ is the bias weight. The output of a hypothesis is computed by

$\hat{y} = w_0 + w_1 x_1 + \cdots + w_n x_n = \mathbf{w} \cdot \mathbf{x}$

The loss on example $(\mathbf{x}, y)$ is typically one of:

  - Squared error loss: $L_2(y, \hat{y}) = (y - \hat{y})^2$
  - Absolute error loss: $L_1(y, \hat{y}) = |y - \hat{y}|$
  - 0/1 loss: $L_{0/1}(y, \hat{y}) = 0$ if $y = \hat{y}$, else $1$
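The Naive Bayes example can be checked with exact fractions. The sketch below is not part of the slides: the per-attribute counts (for instance, 2 of the 9 positive examples being sunny) are an assumption taken from the standard 14-example weather data, which matches the totals of 9 positive and 5 negative examples quoted in the example. The names `laplace` and `counts` are illustrative.

```python
from fractions import Fraction


def laplace(m, n, k):
    """Laplace's law of succession: (m + 1) / (n + k)."""
    return Fraction(m + 1, n + k)


# Assumed counts from the standard 14-example weather data.
N_POS, N_NEG = 9, 5
# attribute value -> (count among pos, count among neg, number of attribute values)
counts = {
    "outlook=sunny": (2, 3, 3),
    "temperature=hot": (2, 2, 3),
    "humidity=high": (3, 4, 2),
    "windy=false": (6, 2, 2),
}

# Class priors: two possible classes, so k = 2.
p_pos = laplace(N_POS, N_POS + N_NEG, 2)
p_neg = laplace(N_NEG, N_POS + N_NEG, 2)

# Unnormalized posteriors: prior times the product of conditionals.
score_pos, score_neg = p_pos, p_neg
for m_pos, m_neg, k in counts.values():
    score_pos *= laplace(m_pos, N_POS, k)
    score_neg *= laplace(m_neg, N_NEG, k)

# alpha normalizes the two scores so they sum to 1.
alpha = 1 / (score_pos + score_neg)
posterior_pos = alpha * score_pos
posterior_neg = alpha * score_neg

print("unnormalized:", float(score_pos), float(score_neg))
print("normalized:  ", float(posterior_pos), float(posterior_neg))
```

Under these assumed counts the negative class gets the larger posterior for the first example.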
Example of Numeric Examples

[Table: columns are No., input attributes (Sunny, Rainy, Hot, Cool, Humid, Windy), and Output]

Linear Regression

Linear regression finds the weights that minimize loss over the training set. Gradient descent changes the weights based on the gradient, that is, the derivatives of the loss with respect to the weights (see Least Squares Gradient Descent below). The linear least squares algorithm calculates the weights by:

$\mathbf{w} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}$

where $X$ is the data matrix and $\mathbf{y}$ is the vector of outputs. Classification can be performed by:

if $\mathbf{w} \cdot \mathbf{x} > 0$ then positive else negative

Least Squares Gradient Descent

    w ← zeroes
    loop until convergence
        for each example (x_j, y_j)
            ŷ_j ← w · x_j
            for each w_i in w
                w_i ← w_i + α (y_j − ŷ_j) x_ij

where α is the learning rate. This is a small number chosen to trade off speed of convergence against closeness to the optimal weights.

Perceptron Learning Rule [differs from book]

A perceptron does gradient descent for absolute error loss (more accurately, "ramp loss"). This assumes each y_j is 1 or −1.

    w ← zeroes
    loop until convergence
        for each example (x_j, y_j)
            ŷ_j ← w · x_j
            if (y_j = 1 and ŷ_j < 1) or (y_j = −1 and ŷ_j > −1)
            then for each w_i in w
                w_i ← w_i + α y_j x_ij

Again, α is the learning rate.

Perceptrons Continued

The perceptron convergence theorem states that if some w classifies all the training examples correctly, then the perceptron learning rule will converge to zero error on the training examples. Usually, many epochs (passes over the training examples) are needed until convergence. If zero error is not possible, use a small learning rate α that shrinks in proportion to 1/n, where n is the number of normalized or binary inputs.
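The two update rules above can be sketched in plain Python. This is an illustration under stated assumptions rather than the course's reference code: a fixed epoch budget stands in for "loop until convergence", the perceptron uses the thresholds 1 and −1 (one reading of the ramp-loss condition), and the toy datasets are made up. Each input vector starts with a constant 1 for the bias weight.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def lms_gradient_descent(examples, alpha=0.05, epochs=1000):
    """Least squares updates: w_i <- w_i + alpha * (y - y_hat) * x_i."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):  # fixed budget instead of a convergence test
        for x, y in examples:
            y_hat = dot(w, x)
            w = [wi + alpha * (y - y_hat) * xi for wi, xi in zip(w, x)]
    return w


def perceptron(examples, alpha=1.0, max_epochs=100):
    """Perceptron updates for y in {+1, -1}: w_i <- w_i + alpha * y * x_i."""
    w = [0.0] * len(examples[0][0])
    for _ in range(max_epochs):
        updated = False
        for x, y in examples:
            y_hat = dot(w, x)
            # Assumed ramp-loss condition: update while y_hat has not reached y.
            if (y == 1 and y_hat < 1) or (y == -1 and y_hat > -1):
                w = [wi + alpha * y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:  # a full pass with no updates: converged
            break
    return w


# Made-up regression data generated from y = 1 + 2 * x1.
line_data = [((1.0, 0.0), 1.0), ((1.0, 1.0), 3.0), ((1.0, 2.0), 5.0)]
w_lms = lms_gradient_descent(line_data)
print("least squares weights:", w_lms)

# Made-up classification data: logical AND with labels +1 / -1.
and_data = [((1, 0, 0), -1), ((1, 0, 1), -1), ((1, 1, 0), -1), ((1, 1, 1), 1)]
w_and = perceptron(and_data)
print("perceptron weights:", w_and)
print("predictions:", [1 if dot(w_and, x) > 0 else -1 for x, _ in and_data])
```

A smaller α makes the least squares weights settle closer to the optimum at the cost of more epochs, which is the trade-off the slide describes.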
Example of Perceptron Learning

[Table: one row per training step, with columns for the inputs x_1 to x_4, the target y, the prediction ŷ, the loss L, and the weights]

The Nearest Neighbor Algorithm

The k-nearest neighbor algorithm classifies a test example by finding the k closest training examples and returning the most common class among them.

Suppose 10% noise (so the best possible test error is 10%). With sufficient training examples, a test example will agree with its nearest neighbor with probability (0.9)(0.9) + (0.1)(0.1) = 0.82 (both not noisy or both noisy) and disagree with probability (0.9)(0.1) + (0.1)(0.9) = 0.18.

In general, nearest neighbor converges to less than twice the optimal error; for larger k, the error converges to a smaller percentage above optimal.

Neural Networks

Artificial Neural Networks

An (artificial) neural network consists of units, connections, and weights. Inputs and outputs are numeric.

    Biological NN        Artificial NN
    soma                 unit
    axon, dendrite       connection
    synapse              weight
    potential            weighted sum
    threshold            bias weight
    signal               activation

ANN Structure

A typical unit j receives inputs $a_1, a_2, \ldots$ from other units, performs a weighted sum

$in_j = w_{0j} + \sum_i w_{ij}\, a_i$

and outputs the activation $a_j = g(in_j)$.

Typically, input units store the inputs, hidden units transform the inputs into an internal numeric vector, and an output unit transforms the hidden values into the prediction.

An ANN is a function $f(\mathbf{x}, W) = a$, where $\mathbf{x}$ is an example, $W$ is the weights, and $a$ is the prediction (the activation value from the output unit). Learning is finding a $W$ that minimizes error.
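The k-nearest neighbor rule described earlier in this section, together with its noise arithmetic, can be sketched as follows. The function name `knn_classify`, the Euclidean distance, and the toy training points are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_classify(train, query, k=1):
    """Return the most common class among the k training examples closest to query."""
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up 2-D training set: one cluster per class.
train = [
    ((0.0, 0.0), "neg"), ((0.0, 1.0), "neg"), ((1.0, 0.0), "neg"),
    ((3.0, 3.0), "pos"), ((3.0, 4.0), "pos"), ((4.0, 3.0), "pos"),
]
near_neg = knn_classify(train, (0.5, 0.5), k=3)
near_pos = knn_classify(train, (3.5, 3.5), k=3)
print(near_neg, near_pos)

# Noise arithmetic from the text: with 10% label noise, a test example and
# its nearest neighbor agree when both are clean or both are noisy.
noise = 0.1
agree = (1 - noise) ** 2 + noise ** 2
disagree = 2 * noise * (1 - noise)
print(round(agree, 2), round(disagree, 2))
```

Sorting the whole training set is the simplest way to find the k closest points; it costs O(n log n) per query, which is acceptable for a small illustration.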
ANN Illustration

[Figure: feed-forward network with four input units x_1 to x_4, two hidden units a_5 and a_6, and one output unit a_7. Weights w_15, w_25, w_35, w_45 and w_16, w_26, w_36, w_46 connect the inputs to the hidden units; w_57 and w_67 connect the hidden units to the output unit; each non-input unit also has a bias weight.]

Illustration

[Figure: the same layout of input units, hidden units, weights, and output unit a_7]

Sigmoid Activation

The sigmoid function is defined as:

$\mathrm{sigmoid}(x) = \dfrac{1}{1 + e^{-x}}$

It is commonly used for ANN activation functions:

$a_j = \mathrm{sigmoid}(in_j) = \mathrm{sigmoid}\!\left(w_{0j} + \sum_i w_{ij}\, a_i\right)$

Note that

$\dfrac{\partial\, \mathrm{sigmoid}(x)}{\partial x} = \mathrm{sigmoid}(x)\,\bigl(1 - \mathrm{sigmoid}(x)\bigr)$

Plot of Sigmoid Function

[Figure: plot of sigmoid(x) against x]
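As a minimal sketch of the sigmoid unit and of a forward pass through a network shaped like the illustration (four inputs, two hidden units, one output), the code below may help. The weight values are hypothetical, and the list layout `[bias, w_1, w_2, ...]` per unit is a choice made here, not something the slides prescribe.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Derivative identity from the text: sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def forward(x, hidden_weights, output_weights):
    """Each weight list is [bias, w_1, w_2, ...] for one unit."""
    hidden = [
        sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
        for w in hidden_weights
    ]
    out = sigmoid(
        output_weights[0]
        + sum(wj * aj for wj, aj in zip(output_weights[1:], hidden))
    )
    return hidden, out


# Hypothetical weights for units a_5 and a_6 (hidden) and a_7 (output).
hidden_weights = [[0.1, 0.2, -0.3, 0.4, 0.1], [-0.2, 0.3, 0.1, -0.1, 0.2]]
output_weights = [0.05, 0.6, -0.4]
hidden, a7 = forward([1.0, 0.0, 1.0, 0.0], hidden_weights, output_weights)
print("hidden activations:", hidden)
print("output activation:", a7)

# Numerical check of the derivative identity at one point.
h = 1e-5
numeric = (sigmoid(0.7 + h) - sigmoid(0.7 - h)) / (2 * h)
print(numeric, sigmoid_prime(0.7))
```

With all weights set to zero, every unit outputs sigmoid(0) = 0.5, which is a quick sanity check for the wiring.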
Backpropagation

One learning method is backpropagating the error from the output to all of the weights. It is an application of the delta rule. Given loss $L(W, \mathbf{x}, y)$, obtain the gradient:

$\nabla L(W, \mathbf{x}, y) = \left[\ldots, \dfrac{\partial L}{\partial w_{ij}}, \ldots\right]$

To decrease error, use the update rule:

$w_{ij} \leftarrow w_{ij} - \alpha\, \dfrac{\partial L}{\partial w_{ij}}$

where $\alpha$ is the learning rate.

Applying The Chain Rule

Using $L = (y_k - a_k)^2$ for output unit k:

$\dfrac{\partial L}{\partial w_{jk}} = \dfrac{\partial L}{\partial a_k}\, \dfrac{\partial a_k}{\partial in_k}\, \dfrac{\partial in_k}{\partial w_{jk}} = -2\,(y_k - a_k)\; a_k (1 - a_k)\; a_j$

For weights from input to hidden units:

$\dfrac{\partial L}{\partial w_{ij}} = \dfrac{\partial L}{\partial a_k}\, \dfrac{\partial a_k}{\partial in_k}\, \dfrac{\partial in_k}{\partial a_j}\, \dfrac{\partial a_j}{\partial in_j}\, \dfrac{\partial in_j}{\partial w_{ij}} = -2\,(y_k - a_k)\; a_k (1 - a_k)\; w_{jk}\; a_j (1 - a_j)\; x_i$

Support Vector Machines

An SVM assigns a weight $\alpha_i$ to each example $(\mathbf{x}_i, y_i)$, where $\mathbf{x}_i$ is an attribute value vector and $y_i$ is either 1 or −1. An SVM computes a discriminant by:

$h(\mathbf{x}) = \mathrm{sign}\!\left(b + \sum_i \alpha_i\, y_i\, K(\mathbf{x}, \mathbf{x}_i)\right)$

where $K$ is a kernel function. An SVM learns by optimizing the error function:

minimize $\dfrac{\lVert h \rVert^2}{2} + \sum_i \max\bigl(0,\; 1 - y_i\, h(\mathbf{x}_i)\bigr)$ subject to $0 \le \alpha_i \le C$

where $\lVert h \rVert$ is the size of $h$ in kernel space.

Example SVM for Separable Examples

[Figure: separable examples with three parallel lines of the form w·x + b = constant: the decision boundary and the two margins]
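The chain-rule expressions from the backpropagation slides can be checked against a finite-difference estimate. The sketch below assumes the squared loss $L = (y_k - a_k)^2$, a single hidden layer of sigmoid units, and hypothetical weights; `gradients` and `numeric_grad_hidden` are illustrative names.

```python
import copy
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    # w_hidden[j] = [w_0j, w_1j, ..., w_nj]; w_out = [w_0k, w_1k, ..., w_mk]
    a_hidden = [
        sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
        for w in w_hidden
    ]
    a_k = sigmoid(w_out[0] + sum(wj * aj for wj, aj in zip(w_out[1:], a_hidden)))
    return a_hidden, a_k


def loss(x, y, w_hidden, w_out):
    return (y - forward(x, w_hidden, w_out)[1]) ** 2


def gradients(x, y, w_hidden, w_out):
    """Analytic gradients from the chain rule; bias inputs count as 1."""
    a_hidden, a_k = forward(x, w_hidden, w_out)
    delta_k = -2.0 * (y - a_k) * a_k * (1.0 - a_k)  # dL / d in_k
    grad_out = [delta_k] + [delta_k * a_j for a_j in a_hidden]
    grad_hidden = []
    for j, a_j in enumerate(a_hidden):
        delta_j = delta_k * w_out[j + 1] * a_j * (1.0 - a_j)  # dL / d in_j
        grad_hidden.append([delta_j] + [delta_j * x_i for x_i in x])
    return grad_hidden, grad_out


def numeric_grad_hidden(x, y, w_hidden, w_out, j, i, eps=1e-6):
    """Central-difference estimate of dL/dw for entry i of hidden unit j."""
    plus, minus = copy.deepcopy(w_hidden), copy.deepcopy(w_hidden)
    plus[j][i] += eps
    minus[j][i] -= eps
    return (loss(x, y, plus, w_out) - loss(x, y, minus, w_out)) / (2 * eps)


# Hypothetical example and weights (4 inputs, 2 hidden units, 1 output).
x, y = [1.0, 0.0, 1.0, 0.0], 1.0
w_hidden = [[0.1, 0.2, -0.3, 0.4, 0.1], [-0.2, 0.3, 0.1, -0.1, 0.2]]
w_out = [0.05, 0.6, -0.4]

grad_hidden, grad_out = gradients(x, y, w_hidden, w_out)
print(grad_hidden[0][1], numeric_grad_hidden(x, y, w_hidden, w_out, 0, 1))

# One gradient-descent step: w <- w - alpha * dL/dw.
alpha = 0.5
w_out_new = [w - alpha * g for w, g in zip(w_out, grad_out)]
w_hidden_new = [
    [w - alpha * g for w, g in zip(ws, gs)]
    for ws, gs in zip(w_hidden, grad_hidden)
]
print(loss(x, y, w_hidden, w_out), loss(x, y, w_hidden_new, w_out_new))
```

Comparing analytic and numeric gradients like this is a common way to catch sign or indexing mistakes before training.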
Example SVM for Nonseparable Examples

[Figure: nonseparable examples with three parallel lines of the form w·x + b = constant]

Example Gaussian Kernel SVM

[Figure]

Example Gaussian Kernel, Zoomed In

[Figure]
Ensemble Learning

There are many algorithms for learning a single hypothesis. Ensemble learning learns and combines a collection of hypotheses by running the algorithm on different training sets.

Bagging (briefly mentioned in the book) runs a learning algorithm on repeated subsamples of the training set. If there are n examples, then a subsample of n examples is generated by sampling with replacement. On a test example, each hypothesis casts one vote for the class it predicts.

Boosting

In boosting, the hypotheses are learned in sequence. Both hypotheses and examples have weights, with different purposes. After each hypothesis is learned, its weight is set based on its error rate, and the weights of the training examples (initially all equal) are also modified. On a test example, when each hypothesis predicts a class, its weight is the size of its vote. The ensemble predicts the class with the highest total vote.

Example Run of AdaBoost

Using the 14 examples as a training set: the hypothesis windy = false ⇒ class = pos is wrong on 5 of the 14 examples. The weights of the correctly classified examples are multiplied by 5/9, then all weights are multiplied by 14/10 so that they sum to 1 again. This hypothesis has a weight of log(9/5).

Note that after the weight update, the total weight of the correctly classified examples equals the total weight of the incorrectly classified examples.

Example Run of AdaBoost, Continued

The next hypothesis must be different from the previous one to have error less than 1/2. Now the hypothesis outlook = overcast ⇒ class = pos has an error rate of 29/90. The weights of the correctly classified examples are multiplied by 29/61 ≈ 0.475, then all weights are multiplied by 90/58 ≈ 1.55 so that they sum to 1 again. This hypothesis has a weight of log(61/29).
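The two rounds of the example run can be reproduced with exact fractions. This sketch assumes the 14 examples are the standard weather data (only the outlook, windy, and class columns are needed), which is consistent with the error counts quoted in the run; the helper names are illustrative.

```python
import math
from fractions import Fraction

# Assumed standard weather data: (outlook, windy, class).
data = [
    ("sunny", False, "neg"), ("sunny", True, "neg"), ("overcast", False, "pos"),
    ("rain", False, "pos"), ("rain", False, "pos"), ("rain", True, "neg"),
    ("overcast", True, "pos"), ("sunny", False, "neg"), ("sunny", False, "pos"),
    ("rain", False, "pos"), ("sunny", True, "pos"), ("overcast", True, "pos"),
    ("overcast", False, "pos"), ("rain", True, "neg"),
]


def h_windy(ex):
    """windy = false => pos, otherwise neg."""
    return "pos" if not ex[1] else "neg"


def h_overcast(ex):
    """outlook = overcast => pos, otherwise neg."""
    return "pos" if ex[0] == "overcast" else "neg"


def boost_round(h, weights):
    """One AdaBoost weight update; returns error, factor, scale, new weights."""
    error = sum(w for ex, w in zip(data, weights) if h(ex) != ex[2])
    factor = error / (1 - error)  # applied to correctly classified examples
    scaled = [w * factor if h(ex) == ex[2] else w for ex, w in zip(data, weights)]
    total = sum(scaled)
    return error, factor, 1 / total, [w / total for w in scaled]


weights = [Fraction(1, len(data))] * len(data)
rounds = []
for h in (h_windy, h_overcast):
    error, factor, scale, weights = boost_round(h, weights)
    vote = math.log(float((1 - error) / error))
    rounds.append((error, factor, scale))
    print(h.__name__, "error", error, "factor", factor, "scale", scale, "vote", vote)
```

Fractions print in lowest terms, so the normalizing multipliers 14/10 and 90/58 appear as 7/5 and 45/29.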
Example Boosting Algorithm

    AdaBoost(examples, algorithm, iterations)
        n ← number of examples
        initialize weights w[1 ... n] to 1/n
        for i from 1 to iterations
            h[i] ← algorithm(examples)
            error ← sum of the weights of the examples misclassified by h[i]
            for j from 1 to n
                if h[i] is correct on example j
                then w[j] ← w[j] · error / (1 − error)
            normalize w[1 ... n] so it sums to 1
            weight of h[i] ← log((1 − error) / error)
        return h[1 ... iterations] and their weights
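A runnable sketch of the pseudocode above follows. It departs from the slide in three labeled ways: the learning algorithm receives the example weights explicitly, a guard stops early when the error is 0 or at least 1/2 (the pseudocode has no such guard), and `stump_learner` is a hypothetical weak learner for one-dimensional inputs added only to make the example self-contained.

```python
import math


def adaboost(examples, algorithm, iterations):
    """examples: list of (x, label); algorithm(examples, weights) returns h(x)."""
    n = len(examples)
    w = [1.0 / n] * n
    hypotheses, votes = [], []
    for _ in range(iterations):
        h = algorithm(examples, w)
        error = sum(wj for (x, y), wj in zip(examples, w) if h(x) != y)
        if error == 0:      # guard (assumption): a perfect hypothesis stands alone
            return [h], [1.0]
        if error >= 0.5:    # guard (assumption): no better than chance, stop
            break
        # Shrink the weights of correctly classified examples, then normalize.
        w = [wj * error / (1 - error) if h(x) == y else wj
             for (x, y), wj in zip(examples, w)]
        total = sum(w)
        w = [wj / total for wj in w]
        hypotheses.append(h)
        votes.append(math.log((1 - error) / error))
    return hypotheses, votes


def predict(hypotheses, votes, x):
    """Weighted vote: each hypothesis adds its weight to the class it predicts."""
    tally = {}
    for h, v in zip(hypotheses, votes):
        label = h(x)
        tally[label] = tally.get(label, 0.0) + v
    return max(tally, key=tally.get)


def stump_learner(examples, weights):
    """Hypothetical weak learner: best single-threshold rule on a 1-D input."""
    xs = sorted({x for x, _ in examples})
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in thresholds:
        for above, below in (("pos", "neg"), ("neg", "pos")):
            def h(x, t=t, above=above, below=below):
                return above if x > t else below
            err = sum(w for (x, y), w in zip(examples, weights) if h(x) != y)
            if best is None or err < best[0]:
                best = (err, h)
    return best[1]


# Made-up data that no single threshold separates.
toy = [(1, "pos"), (2, "pos"), (3, "neg"), (4, "neg"), (5, "pos"), (6, "pos")]
hyps, votes = adaboost(toy, stump_learner, iterations=5)
correct = sum(predict(hyps, votes, x) == y for x, y in toy)
print(len(hyps), "hypotheses, training accuracy", correct / len(toy))
```

Passing the weights to the learner lets it focus on the examples earlier hypotheses got wrong; a learner that cannot use weights could instead be trained on a sample drawn according to them.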
More informationARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD
ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided
More informationFeedforward Neural Nets and Backpropagation
Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features
More information18.9 SUPPORT VECTOR MACHINES
744 Chapter 8. Learning from Examples is the fact that each regression problem will be easier to solve, because it involves only the examples with nonzero weight the examples whose kernels overlap the
More informationHierarchical Boosting and Filter Generation
January 29, 2007 Plan Combining Classifiers Boosting Neural Network Structure of AdaBoost Image processing Hierarchical Boosting Hierarchical Structure Filters Combining Classifiers Combining Classifiers
More informationCOMS 4771 Introduction to Machine Learning. Nakul Verma
COMS 4771 Introduction to Machine Learning Nakul Verma Announcements HW1 due next lecture Project details are available decide on the group and topic by Thursday Last time Generative vs. Discriminative
More informationNeural Networks. Singlelayer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Singlelayer neural network 3/9/7 Perceptron as a neural
More informationLecture 4: Perceptrons and Multilayer Perceptrons
Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II  Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons
More informationMachine Learning and Data Mining. Multilayer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler
+ Machine Learning and Data Mining Multilayer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions
More informationARTIFICIAL INTELLIGENCE. Artificial Neural Networks
INFOB2KI 20172018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
More informationBayesian Learning. Reading: Tom Mitchell, Generative and discriminative classifiers: Naive Bayes and logistic regression, Sections 12.
Bayesian Learning Reading: Tom Mitchell, Generative and discriminative classifiers: Naive Bayes and logistic regression, Sections 12. (Linked from class website) Conditional Probability Probability of
More informationFINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE
FINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE You are allowed a twopage cheat sheet. You are also allowed to use a calculator. Answer the questions in the spaces provided on the question sheets.
More informationFinal Examination CS5402: Introduction to Artificial Intelligence
Final Examination CS5402: Introduction to Artificial Intelligence May 9, 2018 LAST NAME: SOLUTIONS FIRST NAME: Directions 1. This exam contains 33 questions worth a total of 100 points 2. Fill in your
More informationPart I Week 7 Based in part on slides from textbook, slides of Susan Holmes
Part I Week 7 Based in part on slides from textbook, slides of Susan Holmes Support Vector Machine, Random Forests, Boosting December 2, 2012 1 / 1 2 / 1 Neural networks Artificial Neural networks: Networks
More informationLogistic Regression & Neural Networks
Logistic Regression & Neural Networks CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Logistic Regression Perceptron & Probabilities What if we want a probability
More informationCS 188: Artificial Intelligence. Outline
CS 188: Artificial Intelligence Lecture 21: Perceptrons Pieter Abbeel UC Berkeley Many slides adapted from Dan Klein. Outline Generative vs. Discriminative Binary Linear Classifiers Perceptron Multiclass
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More information