CS 730/730W/830: Intro AI

CS 730/730W/830: Intro AI
1 handout: slides
Wheeler Ruml (UNH), Lecture 20, CS 730

Outline: Summary, k-NN

Summary
- supervised learning: induction, generalization
- classification vs. regression
- numeric, ordinal, discrete
- confidence estimates
- k-NN: distance function (any attributes), any labels
- neural network: numeric attributes, numeric or binary labels

Using the k-Nearest Neighbors
- majority vote; k = 1 gives Voronoi cells
- $d(a,b) = \sqrt{\sum_i (a_i - b_i)^2}$
- normalize dimensions (divide by $\sqrt{\frac{1}{N}\sum_i (x_i - \bar{x})^2}$)
- weight by distance?
- +: robust to noise, choose k by easy cross-validation
- −: memory (kd-tree), irrelevant features, sparse data in high dimensions
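A minimal sketch (Python, not from the lecture) of the rule on this slide: Euclidean distance plus a majority vote over the k closest training examples. The function names and the tiny data set are illustrative assumptions.

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """Classify `query` by majority vote among its k nearest training examples."""
    # Euclidean distance d(a, b) = sqrt(sum_i (a_i - b_i)^2), as on the slide
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    neighbors = sorted(examples, key=lambda ex: dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Tiny illustrative data set; attributes should already be normalized to comparable scales.
train = [((0.0, 0.1), 'a'), ((0.2, 0.0), 'a'), ((1.0, 1.1), 'b'), ((0.9, 1.0), 'b')]
print(knn_classify((0.1, 0.2), train, k=3))  # -> 'a' (two of the three nearest are 'a')
```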

Neural Networks
- numeric attributes, numeric or binary labels
- units
- plain single layer: regression
- single layer with thresholds: Perceptron
- +: general (can be non-linear)
- −: hard to train, hard to interpret

Regression
$\hat{y} = \theta_0 f_0(x) + \theta_1 f_1(x) + \theta_2 f_2(x) + \ldots$
$\hat{y} = \theta_0 + \theta_1 x$
$\hat{y} = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3$
$\hat{y} = \theta_0 + \theta_1 \sin x$
How to learn for large data, or on-line?
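As a small illustration (Python, not from the slides), all three model families above are the same linear-in-the-parameters form $\hat{y} = \theta \cdot f(x)$ with different basis functions $f_i$; the names and numbers are assumptions.

```python
import math

def predict(theta, basis, x):
    """Basis-function regression: y_hat = theta_0 f_0(x) + theta_1 f_1(x) + ..."""
    return sum(t * f(x) for t, f in zip(theta, basis))

# The three model families from the slide, all linear in the parameters theta:
line  = [lambda x: 1.0, lambda x: x]                                   # theta0 + theta1 x
cubic = [lambda x: 1.0, lambda x: x, lambda x: x**2, lambda x: x**3]   # cubic polynomial
sine  = [lambda x: 1.0, math.sin]                                      # theta0 + theta1 sin x

print(predict([1.0, 2.0], line, 3.0))           # 1 + 2*3 = 7.0
print(predict([0.0, 1.0], sine, math.pi / 2))   # sin(pi/2) = 1.0
```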

On-line Regression
$\hat{y} = \theta \cdot f(x)$
given sample $(x, y)$, want an update that decreases $E = \frac{(\hat{y} - y)^2}{2}$:
$\theta_i \leftarrow \theta_i - \alpha \frac{\partial E}{\partial \theta_i}$
$\frac{\partial E}{\partial \theta_i} = \frac{\partial}{\partial \theta_i}\frac{(\hat{y}-y)^2}{2} = (\hat{y}-y)\,\frac{\partial}{\partial \theta_i}(\hat{y}-y) = (\hat{y}-y)\,\frac{\partial}{\partial \theta_i}\big(\theta \cdot f(x) - y\big) = (\hat{y}-y)\, f_i(x)$
$\theta_i \leftarrow \theta_i - \alpha\,(\hat{y}-y)\, f_i(x)$
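A minimal sketch of this update loop (Python; the data, step size, and function names are illustrative assumptions, not from the lecture):

```python
def sgd_step(theta, basis, x, y, alpha=0.05):
    """One on-line gradient step for squared error E = (y_hat - y)^2 / 2:
    theta_i <- theta_i - alpha * (y_hat - y) * f_i(x)."""
    y_hat = sum(t * f(x) for t, f in zip(theta, basis))
    err = y_hat - y
    return [t - alpha * err * f(x) for t, f in zip(theta, basis)]

# Learn y = 3 + 2x from repeated on-line samples (illustrative data).
basis = [lambda x: 1.0, lambda x: x]
theta = [0.0, 0.0]
for _ in range(2000):
    for x in [0.0, 1.0, 2.0, 3.0]:
        theta = sgd_step(theta, basis, x, 3.0 + 2.0 * x)
print(theta)  # approaches [3.0, 2.0]
```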

The LMS Procedure
$\hat{y} = \theta \cdot x$
$\theta \leftarrow \theta - \alpha\,(\hat{y} - y)\,x$
for $x = \langle 1, x_1, x_2 \rangle$, the updates are:
$\theta_0 \leftarrow \theta_0 - \alpha(\hat{y} - y)$
$\theta_1 \leftarrow \theta_1 - \alpha(\hat{y} - y)\,x_1$
$\theta_2 \leftarrow \theta_2 - \alpha(\hat{y} - y)\,x_2$
$\alpha = 1/N$? or $100/(100+N)$? or $0.1$?
also known as the delta rule, LMS weight update, Adaline rule, Widrow-Hoff rule, Perceptron rule
- converges if the data are linear
- perceptron: converges in finite time if the data are linearly separable
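A sketch of the same update specialized to $f_i(x) = x_i$, using the decaying rate $\alpha = 100/(100+N)$ from the slide (Python; data and names are illustrative assumptions):

```python
def lms_update(theta, x, y, alpha):
    """Widrow-Hoff / LMS update: theta <- theta - alpha * (y_hat - y) * x, with x[0] = 1."""
    y_hat = sum(t * xi for t, xi in zip(theta, x))
    err = y_hat - y
    return [t - alpha * err * xi for t, xi in zip(theta, x)]

# Learn y = 3 + x1 + 2*x2 on-line, decaying alpha as more samples N are seen.
theta, n = [0.0, 0.0, 0.0], 0
data = [(0.5, 0.5, 4.5), (1.0, 0.0, 4.0), (0.0, 1.0, 5.0)]   # illustrative samples
for _ in range(500):
    for x1, x2, y in data:
        n += 1
        theta = lms_update(theta, [1.0, x1, x2], y, alpha=100.0 / (100.0 + n))
print(theta)  # approaches [3.0, 1.0, 2.0]
```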

Break
- asst 4
- exam 2: planning, MDPs, RL
- projects!

The Three-layer Architecture
- hidden layer
- non-linear!
- training: backwards error propagation
- recurrence

Backwards Error Propagation
$k$ inputs, $j$ hidden units, $i$ outputs
$g'(in_i)$ is the derivative of the activation function w.r.t. its input
$\Delta_i = g'(in_i)\,(\hat{y} - y)$
$W_{j,i} \leftarrow W_{j,i} - \alpha\, a_j\, \Delta_i$
$\Delta_j = g'(in_j)\,\sum_i W_{j,i}\, \Delta_i$
$W_{k,j} \leftarrow W_{k,j} - \alpha\, a_k\, \Delta_j$
only locally optimal, dependence on structure
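A minimal sketch of these updates for a single-output, three-layer network with sigmoid units (Python; the sigmoid choice, the single output, and all names are assumptions, not the lecture's code):

```python
import math

def sigmoid(z):  return 1.0 / (1.0 + math.exp(-z))
def dsigmoid(z): return sigmoid(z) * (1.0 - sigmoid(z))   # g'(in)

def backprop_step(W_kj, W_ji, x, y, alpha=0.5):
    """One example's worth of the slide's updates for a single-output, three-layer net.

    W_kj[j][k]: input-to-hidden weights; W_ji[j]: hidden-to-output weights; a_k = x_k.
    """
    # forward pass
    in_j = [sum(w * xk for w, xk in zip(row, x)) for row in W_kj]
    a_j  = [sigmoid(z) for z in in_j]
    in_i = sum(w * a for w, a in zip(W_ji, a_j))
    y_hat = sigmoid(in_i)

    # output layer: Delta_i = g'(in_i)(y_hat - y);  W_ji <- W_ji - alpha * a_j * Delta_i
    delta_i = dsigmoid(in_i) * (y_hat - y)
    new_W_ji = [w - alpha * a * delta_i for w, a in zip(W_ji, a_j)]

    # hidden layer: Delta_j = g'(in_j) * W_ji * Delta_i;  W_kj <- W_kj - alpha * a_k * Delta_j
    delta_j = [dsigmoid(z) * w * delta_i for z, w in zip(in_j, W_ji)]
    new_W_kj = [[w - alpha * xk * dj for w, xk in zip(row, x)]
                for row, dj in zip(W_kj, delta_j)]
    return new_W_kj, new_W_ji

# One step on a single labeled example (illustrative numbers only):
W_kj = [[0.1, -0.2], [0.3, 0.4]]   # 2 hidden units, 2 inputs
W_ji = [0.5, -0.5]
W_kj, W_ji = backprop_step(W_kj, W_ji, x=[1.0, 0.0], y=1.0)
```

With more than one output unit, the hidden delta sums over $W_{j,i}\,\Delta_i$ exactly as in the formula above; this sketch keeps one output to stay short.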

EOLQs
- What question didn't you get to ask today?
- What's still confusing?
- What would you like to hear more about?
Please write down your most pressing question about AI and put it in the box on your way out. Thanks!