Radial Basis Function Networks


Radial Basis Function Networks

A special type of ANN that has three layers:
- Input layer
- Hidden layer
- Output layer
The mapping from the input to the hidden layer is nonlinear; the mapping from the hidden to the output layer is linear.
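A minimal sketch of this architecture, assuming Gaussian basis functions; the centers t, spread sigma, and weights w are placeholders to be learned:

```python
import numpy as np

def rbf_forward(X, t, sigma, w):
    """X: (n, dim) inputs, t: (m, dim) centers, sigma: spread, w: (m,) weights."""
    d2 = ((X[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)  # squared distances to centers
    H = np.exp(-d2 / (2.0 * sigma ** 2))   # nonlinear input -> hidden mapping
    return H @ w                           # linear hidden -> output mapping
```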

Comparison

Multi-layer perceptron:
- Multiple hidden layers
- Nonlinear mapping
- Weights act through inner products
- Global mapping: warps classifiers
- Stochastic approximation

RBF networks:
- Single hidden layer
- Nonlinear (hidden) + linear (output)
- Weights act through distances
- Local mapping: warps data
- Curve fitting

Another View: Curve Fitting

We try to estimate a mapping from patterns to classes, $f: \text{patterns} \to \text{classes}$, i.e. $f: \mathbf{x} \to d$.
- Patterns are represented as feature vectors $\mathbf{x}$
- Classes are decisions $d$
- Training samples: $f: \mathbf{x}_i \to d_i$, $i = 1, \dots, n$
Interpolate $f$ based on the samples $(\mathbf{x}_i, d_i)$.

Yet Another View: Warping Data

If the problem is not linearly separable, an MLP will use multiple neurons to define complicated decision boundaries (warp the classifiers). Another alternative is to warp the data into a higher dimensional space in which they are much more likely to be linearly separable, so that a single perceptron will do. This is very similar to the idea of the Support Vector Machine.

Example: XOR

Warped XOR: map each input $\mathbf{x}$ through two Gaussian bases,
$\varphi_1(\mathbf{x}) = e^{-\|\mathbf{x} - \mathbf{t}_1\|^2}$, $\mathbf{t}_1 = [1, 1]^T$
$\varphi_2(\mathbf{x}) = e^{-\|\mathbf{x} - \mathbf{t}_2\|^2}$, $\mathbf{t}_2 = [0, 0]^T$
In the $(\varphi_1, \varphi_2)$ space the four XOR points become linearly separable.
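A quick numerical check of this example; the separating line $\varphi_1 + \varphi_2 = \text{const}$ used in the final comment is one convenient choice, not the only one:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
labels = np.array([0, 1, 1, 0])                       # XOR truth table
t1, t2 = np.array([1.0, 1.0]), np.array([0.0, 0.0])   # Gaussian centers

phi = np.stack([np.exp(-((X - t1) ** 2).sum(axis=1)),
                np.exp(-((X - t2) ** 2).sum(axis=1))], axis=1)
for x, p, y in zip(X, phi, labels):
    print(x, "->", np.round(p, 3), "class", y)
# phi1 + phi2 is ~1.14 for class 0 and ~0.74 for class 1: linearly separable.
```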

More Examples

A Pure Interpolation Approach

Given: $(\mathbf{x}_i, d_i)$, $i = 1, \dots, n$. Desired: $f(\mathbf{x}_i) = d_i$.
Solution: $f(\mathbf{x}) = \sum_{i=1}^{n} w_i \varphi(\mathbf{x}, \mathbf{x}_i)$ with $f(\mathbf{x}_i) = d_i$.
The general radial basis function solution is shift and rotation invariant:
- Shift invariance requires $\varphi(\mathbf{x}, \mathbf{x}_i) = \varphi(\mathbf{x} - \mathbf{x}_i)$
- Rotation invariance requires $\varphi(\mathbf{x}, \mathbf{x}_i) = \varphi(\|\mathbf{x} - \mathbf{x}_i\|)$
Examples:
- Multiquadrics: $\varphi(r) = \sqrt{r^2 + c^2}$
- Inverse multiquadrics: $\varphi(r) = 1/\sqrt{r^2 + c^2}$
- Gaussian: $\varphi(r) = e^{-r^2 / (2\sigma^2)}$
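The three named basis functions written out directly; $c$ and $\sigma$ are free shape parameters:

```python
import numpy as np

def multiquadric(r, c=1.0):
    return np.sqrt(r ** 2 + c ** 2)

def inverse_multiquadric(r, c=1.0):
    return 1.0 / np.sqrt(r ** 2 + c ** 2)

def gaussian(r, sigma=1.0):
    return np.exp(-r ** 2 / (2.0 * sigma ** 2))
```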

Graphical Interpretation

Each neuron responds based on the distance to the center of its receptive field. The bottom level is a nonlinear mapping; the top level is a linear weighted sum:
$f(\mathbf{x}) = \sum_{i=1}^{m} w_i \varphi(\|\mathbf{x} - \mathbf{x}_i\|)$

Other Alternatives: Global

Lagrange polynomials:
$L_k(x) = \prod_{j \ne k} \frac{x - x_j}{x_k - x_j}, \qquad f(x) = \sum_{k=0}^{n} y_k L_k(x)$
Each $L_k$ is 1 at $x_k$ and 0 at every other node, so $f$ interpolates the samples $(x_k, y_k)$.
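A small illustrative check of the Lagrange form; the sample nodes here are arbitrary:

```python
def lagrange_eval(x, xs, ys):
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        Lk = 1.0
        for j, xj in enumerate(xs):
            if j != k:
                Lk *= (x - xj) / (xk - xj)   # L_k is 1 at x_k, 0 at the other nodes
        total += yk * Lk
    return total

xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]
print([lagrange_eval(x, xs, ys) for x in xs])   # reproduces ys exactly
```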

Other Alternatives: Local

- Bézier basis
- B-spline basis

B-Spline Interpolation

A big subject in mathematics, used in many disciplines: approximation, pattern recognition, computer graphics. As far as pattern recognition is concerned:
- Determine the order of the spline (DOFs)
- Knot vectors (partition into intervals)
- Fitting each interval

Interpolation Solution

Requiring $f(\mathbf{x}_i) = d_i$ for all $i$ gives the linear system
$\begin{bmatrix} \varphi_{11} & \cdots & \varphi_{1n} \\ \vdots & \ddots & \vdots \\ \varphi_{n1} & \cdots & \varphi_{nn} \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}, \qquad \Phi \mathbf{w} = \mathbf{d}, \qquad \mathbf{w} = \Phi^{-1} \mathbf{d}$
where $\varphi_{ij} = \varphi(\|\mathbf{x}_i - \mathbf{x}_j\|)$. $\Phi$ is symmetric, and it is invertible if all the $\mathbf{x}_i$ are distinct.
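A sketch of this solve, assuming the Gaussian kernel from the earlier slide and a synthetic target:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                    # 20 distinct sample points
d = np.sin(X[:, 0]) + X[:, 1] ** 2              # target values d_i

r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
Phi = np.exp(-r ** 2 / 2.0)                     # n x n, symmetric
w = np.linalg.solve(Phi, d)                     # w = Phi^{-1} d
print(np.allclose(Phi @ w, d))                  # interpolates exactly: True
```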

Practical Issue: Accuracy (cont.)

The function $\varphi$ represents the Green's function for a certain differential operator. When it is shift and rotation invariant, we can write it as $\varphi(\|\mathbf{x} - \mathbf{x}_i\|)$; again, the Gaussian kernel is a popular choice here.

Practical Issues

- Accuracy: what if the data are noisy?
- Speed: what if there are many sample points?
- Training: what is the training procedure?

Practical Issue: Accuracy

When data are noisy, pure interpolation represents a form of overfitting. We need a stabilizing or smoothing (regularization) term. The solution should achieve two things:
- Good fitting
- Smoothness

Practical Issue: Accuracy (cont.)

Try to minimize the error as a weighted sum of two terms which impose the fitting and the smoothness constraints:
$E(f) = \sum_{i=1}^{n} (d_i - f(\mathbf{x}_i))^2 + \lambda \|Df\|^2$
The minimizer keeps the same form as before, but with a damped linear system:
$f^*(\mathbf{x}) = \sum_{i=1}^{n} w_i \varphi(\|\mathbf{x} - \mathbf{x}_i\|), \qquad (\Phi + \lambda I)\mathbf{w} = \mathbf{d}$
The solution is rooted in regularization theory, which is way beyond the scope of this course (read the papers on the class Web sites for more details).
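The same solve with the damping term added, as a sketch; the noise level and $\lambda = 0.1$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(40, 1))
d = np.sin(2 * X[:, 0]) + 0.2 * rng.normal(size=40)   # noisy targets

r = np.abs(X - X.T)                             # pairwise distances (1-D inputs)
Phi = np.exp(-r ** 2 / 2.0)
lam = 0.1                                       # smoothing strength lambda
w = np.linalg.solve(Phi + lam * np.eye(len(d)), d)
print(np.abs(Phi @ w - d).max())                # no longer fits the noise exactly
```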

Sidebar I

It can be proven that the MAP estimator (Bayes rule) gives the same results as the regularized RBF solution; the un-regularized fitting solution corresponds to assuming the same (uniform) prior for every $f$. With $P(f \mid D) \propto P(D \mid f)\,P(f)$:
$-\log P(f \mid D) = -\log P(D \mid f) - \log P(f) + c$
Taking $-\log P(D \mid f) = \sum_i (d_i - f(\mathbf{x}_i))^2$ and $-\log P(f) = \lambda \|Df\|^2$ recovers exactly the regularized objective.

Sidebar II

Regularization is also similar to (or called) ridge regression in statistics. The problem there is to fit a model to data without overfitting. In the linear case we have
$\hat{w}^{\text{ridge}} = \arg\min_{w} \sum_{i=1}^{n} \Big( y_i - w_0 - \sum_{j} x_{ij} w_j \Big)^2 + \lambda \sum_{j} w_j^2$
or equivalently
$\hat{w}^{\text{ridge}} = \arg\min_{w} \sum_{i=1}^{n} \Big( y_i - w_0 - \sum_{j} x_{ij} w_j \Big)^2 \quad \text{subject to} \quad \sum_{j} w_j^2 \le s$

Intuition

When variables are highly correlated, their coefficients become poorly determined, with high variance. E.g., a wildly large positive coefficient on one can be canceled by a similarly large negative coefficient on its correlated cousin. A size constraint is helpful. Caveat: the constraint is problem dependent.

Solution to Ridge Regression

Similar to regularization:
$W^{\text{ridge}} = \arg\min_{W} (Y - XW)^T (Y - XW) + \lambda W^T W$
$\frac{d}{dW}\left[ (Y - XW)^T (Y - XW) + \lambda W^T W \right] = -2X^T(Y - XW) + 2\lambda W = 0$
$(X^T X + \lambda I)\, W = X^T Y \qquad \Rightarrow \qquad W^{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T Y$
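The closed form directly in code, on synthetic data with an arbitrary $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
Y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)

lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
print(W)   # coefficients shrunk toward zero relative to ordinary least squares
```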

Ugly Math

Substituting the SVD $X = U\Sigma V^T$ into $W^{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T Y$:
$W^{\text{ridge}} = (V\Sigma^T U^T U\Sigma V^T + \lambda I)^{-1} V\Sigma^T U^T Y = V(\Sigma^T\Sigma + \lambda I)^{-1}\Sigma^T U^T Y$
so the fitted values are
$X W^{\text{ridge}} = U\Sigma(\Sigma^T\Sigma + \lambda I)^{-1}\Sigma^T U^T Y = \sum_{j} \mathbf{u}_j \frac{\sigma_j^2}{\sigma_j^2 + \lambda}\, \mathbf{u}_j^T Y$
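A numerical sanity check of this identity, comparing the closed-form ridge fit against the per-mode shrinkage factors $\sigma_j^2 / (\sigma_j^2 + \lambda)$:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))
Y = rng.normal(size=50)
lam = 2.0

W = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ Y)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
shrink = s ** 2 / (s ** 2 + lam)                 # per-mode shrinkage factors
pred_svd = U @ (shrink * (U.T @ Y))
print(np.allclose(X @ W, pred_svd))              # True
```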

Physical Interpretation

The singular values of $X$ represent the spread of the data along different body-fitting dimensions. To estimate $Y = XW$, ridge regularization minimizes the contribution from the less spread-out dimensions. Less spread-out dimensions usually have much larger variance (high-dimension eigen modes). $\text{trace}\left[X(X^T X + \lambda I)^{-1} X^T\right]$ is called the effective degrees of freedom.

More Details

$\text{trace}\left[X(X^T X + \lambda I)^{-1} X^T\right]$, the effective degrees of freedom, controls how many eigen modes are actually used (active). Different methods are possible:
- Shrinking smoother: contributions are scaled
- Projection smoother: contributions are used or not used (1 or 0)
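Computing the effective degrees of freedom via the equivalent singular-value form $\sum_j \sigma_j^2/(\sigma_j^2 + \lambda)$, on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 6))
s = np.linalg.svd(X, compute_uv=False)
for lam in [0.0, 1.0, 10.0, 100.0]:
    print(lam, (s ** 2 / (s ** 2 + lam)).sum())  # shrinks from 6 toward 0
```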

Practical Issue: Speed

When there are many ($n$) training samples, the matrices are of size $n \times n$, and inverting such a matrix is $O(n^3)$. Remedy: reduce the number of bases used.

Practical Issue: Speed (cont.)

Use $m < n$ basis functions centered at $\mathbf{t}_j$:
$f^*(\mathbf{x}) = \sum_{j=1}^{m} w_j \varphi(\|\mathbf{x} - \mathbf{t}_j\|)$
Minimizing $\sum_{i=1}^{n} (d_i - f^*(\mathbf{x}_i))^2 + \lambda \|Df^*\|^2$ now involves the rectangular $n \times m$ matrix $\Phi$ with $\Phi_{ij} = \varphi(\|\mathbf{x}_i - \mathbf{t}_j\|)$, giving the normal equations
$(\Phi^T \Phi + \lambda \Phi_0)\, \mathbf{w} = \Phi^T \mathbf{d}$
where $\Phi_0$ is the $m \times m$ matrix with entries $\varphi(\|\mathbf{t}_i - \mathbf{t}_j\|)$.
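A reduced-basis sketch; note that it substitutes a plain $\lambda I$ regularizer for the $\lambda \Phi_0$ term above, and picks centers by random subsampling, both simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, size=(2000, 1))
d = np.sin(X[:, 0]) + 0.1 * rng.normal(size=2000)

m = 25                                           # m << n = 2000
centers = X[rng.choice(len(X), size=m, replace=False)]
Phi = np.exp(-(X - centers.T) ** 2 / 2.0)        # rectangular n x m design matrix
lam = 1e-3
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(m), Phi.T @ d)
print(np.abs(Phi @ w - d).mean())                # only an m x m system was solved
```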

Practical Issue: Training

How can the centers of the radial basis functions for the reduced basis set be determined? They can be chosen randomly; training then involves finding $\mathbf{w}$ using the SVD (pseudo-inverse).

Training with K-means

Use unsupervised clustering: find where the data are clustered, since that is where the radial basis functions should be placed. Do this with k-means.

K-Means Algorithm (fixed # of clusters)

1. Arbitrarily pick N cluster centers; assign each sample to the nearest center
2. Compute the sample mean of each cluster
3. Reassign each sample to the cluster with the nearest mean
4. Repeat if there are changes, otherwise stop
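A direct transcription of this loop (no empty-cluster handling; a real implementation would need it):

```python
import numpy as np

def kmeans(X, n_clusters, rng):
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]  # arbitrary pick
    while True:
        # Assign each sample to the nearest center.
        assign = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        # Recompute each cluster's sample mean.
        new = np.array([X[assign == k].mean(axis=0) for k in range(n_clusters)])
        if np.allclose(new, centers):            # no changes: stop
            return centers, assign
        centers = new

rng = np.random.default_rng(6)
X = np.concatenate([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
centers, _ = kmeans(X, 2, rng)
print(centers)   # the RBF centers would be placed here
```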


Training with Gradient Descent

Error expression:
$E = \frac{1}{2}\sum_{i=1}^{n} \Big( d_i - \sum_{j=1}^{m} w_j \varphi(\|\mathbf{x}_i - \mathbf{t}_j\|) \Big)^2$
The free variables in the error expression are:
- Weights $w_j$
- Center locations $\mathbf{t}_j$
- Basis spreads $\Sigma_j$

Effect of Weights

With $e_i = d_i - \sum_{j} w_j \varphi(\|\mathbf{x}_i - \mathbf{t}_j\|)$,
$\frac{\partial E}{\partial w_j} = -\sum_{i=1}^{n} e_i\, \varphi(\|\mathbf{x}_i - \mathbf{t}_j\|)$

Effect of Center Positions

$\frac{\partial E}{\partial \mathbf{t}_j} = 2 w_j \sum_{i=1}^{n} e_i\, \varphi'\!\left(\|\mathbf{x}_i - \mathbf{t}_j\|_{\Sigma_j}^2\right) \Sigma_j^{-1} (\mathbf{x}_i - \mathbf{t}_j)$
where $\varphi'$ is the derivative with respect to the squared weighted distance.

Effect of Basis Spread

$\frac{\partial E}{\partial \Sigma_j^{-1}} = -w_j \sum_{i=1}^{n} e_i\, \varphi'\!\left(\|\mathbf{x}_i - \mathbf{t}_j\|_{\Sigma_j}^2\right) Q_{ij}, \qquad Q_{ij} = (\mathbf{x}_i - \mathbf{t}_j)(\mathbf{x}_i - \mathbf{t}_j)^T$
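Putting the three gradients together in one descent step, as a sketch: the code assumes Gaussian bases with a scalar spread $\sigma_j$ per basis (a special case of the $\Sigma_j$ above) and computes the derivatives directly for that case:

```python
import numpy as np

def rbf_gd_step(X, d, w, t, sigma, lr=0.01):
    """X: (n, dim), d: (n,), w: (m,), t: (m, dim), sigma: (m,)."""
    diff = X[:, None, :] - t[None, :, :]          # x_i - t_j, shape (n, m, dim)
    r2 = (diff ** 2).sum(-1)                      # squared distances, (n, m)
    G = np.exp(-r2 / (2 * sigma ** 2))            # Gaussian responses, (n, m)
    e = d - G @ w                                 # residuals e_i, (n,)
    grad_w = -G.T @ e                             # dE/dw_j
    grad_t = -((w * G * e[:, None] / sigma ** 2)[:, :, None] * diff).sum(axis=0)
    grad_s = -w * (G * e[:, None] * r2).sum(axis=0) / sigma ** 3
    return w - lr * grad_w, t - lr * grad_t, sigma - lr * grad_s

# Usage: iterate rbf_gd_step(...) until E = 0.5 * (e ** 2).sum() stops decreasing.
```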

Details

A lot of theoretical development and results are omitted here, e.g., the relation to kernel regression and SVMs. A lot of tuning considerations are not covered here, e.g., how to determine the free parameters. This is an active research area.

Examples

544,000 data points; 80,000 centers. Accuracy on the order of $10^{-6}$ for all data points.

Problem Definition

Given a point cloud of data (from a laser range scanner, or CT, MRI, etc.), find a single analytical surface approximation, or an inside-outside function $s$:
- Range data are $s = 0$
- Outside is $s > 0$
- Inside is $s < 0$
Just the sample data ($s = 0$) are not enough, since $s$ could be a trivial zero function. We need off-surface data generation.

Procedure

1. Off-surface data generation
2. Choose a subset from the interpolation nodes and fit an RBF only to these
3. Evaluate the residual $e_i = f(\mathbf{x}_i) - s_i$
4. If $\max_i |e_i| <$ accuracy, then stop
5. Else append new centers where $e_i$ is large
6. Re-fit the RBF and go back to step 3
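A sketch of this greedy loop; `fit_rbf` and `evaluate` are hypothetical stand-ins for a real RBF fitter, and the initial subset size and append batch are illustrative choices:

```python
import numpy as np

def greedy_fit(X, s, fit_rbf, evaluate, tol, init=100, batch=50, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=init, replace=False)     # step 2: initial subset
    while True:
        model = fit_rbf(X[idx], s[idx])                    # fit RBF to current centers
        e = evaluate(model, X) - s                         # step 3: residuals e_i
        if np.abs(e).max() < tol:                          # step 4: accurate enough
            return model
        worst = np.argsort(-np.abs(e))[:batch]             # step 5: largest residuals
        idx = np.union1d(idx, worst)                       # step 6: append and re-fit
```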

More Results

Less smoothing vs. more smoothing (figure comparison).