Minimum Squared Error


LDF: Minimum Squared-Error Procedures

Idea: convert to an easier and better understood problem.
- Perceptron: require $a^t y_i > 0$ for all samples $y_i$, that is, solve a system of linear inequalities.
- MSE procedure: require $a^t y_i = b_i$ for all samples $y_i$, that is, solve a system of linear equations.

Choose positive constants $b_1, b_2, \dots, b_n$ and try to find a weight vector $a$ such that $a^t y_i = b_i$ for all samples $y_i$. If such an $a$ can be found, it is a solution, because the $b_i$ are positive. The MSE procedure considers all the samples, not just the misclassified ones.

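LDF: MSE Margins

[Figure: separating hyperplane $g(y) = 0$ with samples $y_i$ and $y_k$ at different distances from it.]

- Since we want $a^t y_i = b_i$, we expect sample $y_i$ to be at distance $b_i$ from the separating hyperplane (normalized by $\|a\|$).
- Thus $b_1, b_2, \dots, b_n$ give the relative expected distances, or "margins", of the samples from the hyperplane.
- Make $b_i$ small if sample $i$ is expected to be near the separating hyperplane, and larger otherwise.
- In the absence of any additional information, there are good reasons to set $b_1 = b_2 = \dots = b_n = 1$.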

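LDF: MSE Matrix Notation

Need to solve $n$ equations:

$$a^t y_1 = b_1, \quad \dots, \quad a^t y_n = b_n$$

Introduce matrix notation:

$$Y = \begin{bmatrix} y_1^{(0)} & y_1^{(1)} & \cdots & y_1^{(d)} \\ y_2^{(0)} & y_2^{(1)} & \cdots & y_2^{(d)} \\ \vdots & \vdots & & \vdots \\ y_n^{(0)} & y_n^{(1)} & \cdots & y_n^{(d)} \end{bmatrix}, \quad a = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_d \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

Thus we need to solve the linear system $Ya = b$.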

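LDF: Exact Solution is Rare

- We need to solve the linear system $Ya = b$, where $Y$ is an $n$ by $(d+1)$ matrix.
- An exact solution can be found only if $Y$ is square and nonsingular, in which case the inverse $Y^{-1}$ exists and $a = Y^{-1} b$.
- This requires (number of samples) = (number of features + 1), which almost never happens in practice.
- In this case, we are guaranteed to find the separating hyperplane.

[Figure: two samples $y_1$ and $y_2$ with the separating hyperplane.]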

LDF: Approximate Solution

- Typically $Y$ is overdetermined, that is, it has more rows (examples) than columns (features). If it has more features than examples, we should reduce the dimensionality.
- We need $Ya = b$, but no exact solution exists for an overdetermined system of equations (more equations than unknowns).
- Find an approximate solution $a$, that is, $Ya \approx b$.
- Note that the approximate solution $a$ does not necessarily give the separating hyperplane in the separable case.
- But the hyperplane corresponding to $a$ may still be a good solution, especially if there is no separating hyperplane.

LDF: MSE Criterion Function

Minimum squared error approach: find $a$ which minimizes the length of the error vector $e$:

$$e = Ya - b$$

Thus minimize the minimum squared error criterion function:

$$J_s(a) = \|Ya - b\|^2 = \sum_{i=1}^{n} (a^t y_i - b_i)^2$$

Unlike the perceptron criterion function, the minimum squared error criterion function can be optimized analytically by setting the gradient to 0.

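LDF: Optimizing $J_s(a)$

$$J_s(a) = \|Ya - b\|^2 = \sum_{i=1}^{n} (a^t y_i - b_i)^2$$

Let's compute the gradient:

$$\nabla J_s(a) = \begin{bmatrix} \partial J_s / \partial a_0 \\ \vdots \\ \partial J_s / \partial a_d \end{bmatrix} = \sum_{i=1}^{n} 2 (a^t y_i - b_i) y_i = 2 Y^t (Ya - b)$$

Setting the gradient to 0:

$$2 Y^t (Ya - b) = 0 \quad \Rightarrow \quad Y^t Y a = Y^t b$$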
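The gradient formula above can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated (hypothetical) data; the function names are illustrative and not from the lecture. It compares the analytic gradient $2Y^t(Ya - b)$ with central finite differences of $J_s$.

```python
import numpy as np

def mse_criterion(Y, a, b):
    """J_s(a) = ||Y a - b||^2."""
    r = Y @ a - b
    return float(r @ r)

def mse_gradient(Y, a, b):
    """Analytic gradient 2 Y^t (Y a - b)."""
    return 2.0 * Y.T @ (Y @ a - b)

def numeric_gradient(Y, a, b, h=1e-6):
    """Central finite differences, one coordinate of a at a time."""
    g = np.zeros_like(a)
    for j in range(a.size):
        step = np.zeros_like(a)
        step[j] = h
        g[j] = (mse_criterion(Y, a + step, b)
                - mse_criterion(Y, a - step, b)) / (2 * h)
    return g

# Hypothetical data: 6 samples, 3 columns (d + 1 = 3).
rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))
b = np.ones(6)
a = rng.normal(size=3)

gradients_agree = np.allclose(mse_gradient(Y, a, b),
                              numeric_gradient(Y, a, b), atol=1e-5)
```

Because $J_s$ is quadratic in $a$, the central difference is exact up to rounding, so the two gradients should agree closely.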

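LDF: Pseudo Inverse Solution

$$Y^t Y a = Y^t b$$

- The matrix $Y^t Y$ is square (it has $d+1$ rows and columns) and it is often non-singular.
- If $Y^t Y$ is non-singular, its inverse exists and we can solve for $a$ uniquely:

$$a = (Y^t Y)^{-1} Y^t b$$

- The matrix $(Y^t Y)^{-1} Y^t$ is the pseudo inverse of $Y$. It satisfies

$$\left( (Y^t Y)^{-1} Y^t \right) Y = (Y^t Y)^{-1} (Y^t Y) = I$$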
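As a rough illustration of the pseudo inverse solution, the sketch below solves the normal equations directly and compares the result with NumPy's `np.linalg.pinv` and `np.linalg.lstsq`. The 5 by 3 matrix is hypothetical data chosen to have full column rank, and `mse_solution` is an illustrative name.

```python
import numpy as np

def mse_solution(Y, b):
    """Solve the normal equations Y^t Y a = Y^t b.

    Assumes Y^t Y is non-singular (Y has full column rank).
    """
    return np.linalg.solve(Y.T @ Y, Y.T @ b)

# Hypothetical overdetermined system: 5 samples, 3 columns.
Y = np.array([[ 1.0,  2.0,  1.0],
              [ 1.0,  0.0,  3.0],
              [ 1.0,  4.0,  2.0],
              [-1.0, -1.0, -5.0],
              [-1.0, -3.0, -4.0]])
b = np.ones(5)

a_normal = mse_solution(Y, b)                    # (Y^t Y)^{-1} Y^t b
a_pinv = np.linalg.pinv(Y) @ b                   # pseudo inverse applied to b
a_lstsq = np.linalg.lstsq(Y, b, rcond=None)[0]   # library least squares

# The pseudo inverse is a left inverse of Y when Y has full column rank.
left_inverse_ok = np.allclose(np.linalg.pinv(Y) @ Y, np.eye(3))
```

All three routes should give the same $a$ here. In practice `lstsq` or `pinv` is usually preferred over forming $(Y^t Y)^{-1}$ explicitly, since it behaves better when $Y^t Y$ is close to singular.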

LDF: Minimum Squared-Error Procedures

If $b_1 = \dots = b_n = 1$, the MSE procedure is equivalent to finding a hyperplane of best fit through the samples $y_1, \dots, y_n$:

$$J_s(a) = \|Ya - 1_n\|^2, \quad 1_n = [1, \dots, 1]^t$$

Then we shift this line to the origin. If the line was a good fit, all samples will be classified correctly.

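LDF: Minimum Squared-Error Procedures

A separating hyperplane is guaranteed only if $Ya > 0$, that is, if all elements of the vector

$$Ya = \begin{bmatrix} a^t y_1 \\ \vdots \\ a^t y_n \end{bmatrix}$$

are positive. We have $Ya \approx b$, that is,

$$Ya = \begin{bmatrix} b_1 + \varepsilon_1 \\ \vdots \\ b_n + \varepsilon_n \end{bmatrix}$$

where each $\varepsilon_i$ may be negative.

- If $\varepsilon_1, \dots, \varepsilon_n$ are small relative to $b_1, \dots, b_n$, then each element of $Ya$ is positive, and $a$ gives a separating hyperplane.
- If the approximation is not good, $\varepsilon_i$ may be large and negative for some $i$; then $b_i + \varepsilon_i$ is negative and $a$ is not a separating hyperplane.

Thus in the linearly separable case, the least squares solution $a$ does not necessarily give a separating hyperplane, but it will give a reasonable hyperplane.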

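LDF: Minimum Squared-Error Procedures

- We are free to choose $b$. We may be tempted to make $b$ large as a way to ensure $Ya \approx b > 0$. This does not work.
- Let $\beta$ be a scalar, and try $\beta b$ instead of $b$.
- If $a^*$ is a least squares solution to $Ya = b$, then for any scalar $\beta$ a least squares solution to $Ya = \beta b$ is $\beta a^*$:

$$\arg\min_a \|Ya - \beta b\|^2 = \arg\min_a \beta^2 \|Y(a/\beta) - b\|^2 = \beta a^*$$

- Thus if the $i$-th element of $Ya^*$ is less than 0, that is $y_i^t a^* < 0$, then $y_i^t (\beta a^*) < 0$ as well (for $\beta > 0$).
- The relative difference between the components of $b$ matters, but not the size of each individual component.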
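The scaling argument can be checked with a few lines of NumPy. This is a small sketch under stated assumptions: the 4 by 3 matrix is an illustrative data set whose least squares solution with $b = [1, 1, 1, 1]^t$ leaves one sample on the wrong side, and $\beta = 100$ is an arbitrary choice.

```python
import numpy as np

# Illustrative "normalized" sample matrix (an assumption for this sketch).
Y = np.array([[ 1.0,  6.0,   9.0],
              [ 1.0,  5.0,   7.0],
              [-1.0, -5.0,  -9.0],
              [-1.0,  0.0, -10.0]])
b = np.ones(4)
beta = 100.0

a_star = np.linalg.lstsq(Y, b, rcond=None)[0]           # solves Ya ~ b
a_scaled = np.linalg.lstsq(Y, beta * b, rcond=None)[0]  # solves Ya ~ beta * b

# Scaling b by beta only scales the solution by beta ...
same_direction = np.allclose(a_scaled, beta * a_star)
# ... so the sign of every a^t y_i, and hence the classification, is unchanged.
same_signs = np.array_equal(np.sign(Y @ a_scaled), np.sign(Y @ a_star))
```

Making every $b_i$ larger therefore cannot turn a misclassified sample into a correctly classified one.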

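LDF: Example

- Class 1: (6 9), (5 7)
- Class 2: (5 9), (0 4)

Set vectors $y_1, y_2, y_3, y_4$ by adding an extra feature and "normalizing" (negating the class 2 samples):

$$y_1 = \begin{bmatrix} 1 \\ 6 \\ 9 \end{bmatrix}, \quad y_2 = \begin{bmatrix} 1 \\ 5 \\ 7 \end{bmatrix}, \quad y_3 = \begin{bmatrix} -1 \\ -5 \\ -9 \end{bmatrix}, \quad y_4 = \begin{bmatrix} -1 \\ 0 \\ -4 \end{bmatrix}$$

Matrix $Y$ is then

$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -4 \end{bmatrix}$$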
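Building $Y$ from the raw samples is mechanical. The sketch below, assuming NumPy, prepends the constant feature 1 to every sample and negates the rows of the second class. The helper name `build_normalized_samples` is hypothetical, and the last class 2 sample is read as (0, 4).

```python
import numpy as np

def build_normalized_samples(class1, class2):
    """Return Y: augment each sample with a leading 1, then negate class 2 rows."""
    X1 = np.asarray(class1, dtype=float)
    X2 = np.asarray(class2, dtype=float)
    Y1 = np.hstack([np.ones((len(X1), 1)), X1])    # class 1 rows kept as is
    Y2 = -np.hstack([np.ones((len(X2), 1)), X2])   # class 2 rows negated
    return np.vstack([Y1, Y2])

Y = build_normalized_samples([(6, 9), (5, 7)], [(5, 9), (0, 4)])
```

With this normalization, a weight vector $a$ classifies every sample correctly exactly when all components of $Ya$ are positive.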

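LDF: Example

Choose $b = [1, 1, 1, 1]^t$. In MATLAB, a = Y\b solves the least squares problem:

$$a \approx \begin{bmatrix} 2.7 \\ 1.0 \\ -0.9 \end{bmatrix}$$

Note that $a$ is only an approximation to $Ya = b$, since no exact solution exists:

$$Ya \approx \begin{bmatrix} 0.4 \\ 1.3 \\ 0.6 \\ 1.1 \end{bmatrix} \neq \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$

This solution does give a separating hyperplane, since $Ya > 0$.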
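The same computation can be sketched in Python, with `np.linalg.lstsq` playing the role of the MATLAB backslash operator for this overdetermined system. The matrix below is the one from the example, with the last class 2 sample read as (0, 4).

```python
import numpy as np

Y = np.array([[ 1.0,  6.0,  9.0],
              [ 1.0,  5.0,  7.0],
              [-1.0, -5.0, -9.0],
              [-1.0,  0.0, -4.0]])
b = np.ones(4)

# Least squares solution of Ya ~ b.
a = np.linalg.lstsq(Y, b, rcond=None)[0]

margins = Y @ a                       # a^t y_i for every sample
separates = bool(np.all(margins > 0))
```

Rounded to one decimal, `a` and `margins` should match the values quoted above, and `separates` should be `True`.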

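LDF: Example

- Class 1: (6 9), (5 7)
- Class 2: (5 9), (0 10)

The last sample is very far from the separating hyperplane compared to the others.

$$y_1 = \begin{bmatrix} 1 \\ 6 \\ 9 \end{bmatrix}, \quad y_2 = \begin{bmatrix} 1 \\ 5 \\ 7 \end{bmatrix}, \quad y_3 = \begin{bmatrix} -1 \\ -5 \\ -9 \end{bmatrix}, \quad y_4 = \begin{bmatrix} -1 \\ 0 \\ -10 \end{bmatrix}$$

Matrix

$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -10 \end{bmatrix}$$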

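LDF: Example

Choose $b = [1, 1, 1, 1]^t$. In MATLAB, a = Y\b solves the least squares problem:

$$a \approx \begin{bmatrix} 3.2 \\ 0.2 \\ -0.4 \end{bmatrix}$$

Note that $a$ is only an approximation to $Ya = b$, since no exact solution exists:

$$Ya \approx \begin{bmatrix} 0.2 \\ 0.9 \\ -0.04 \\ 1.16 \end{bmatrix} \neq \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$

This solution does not give a separating hyperplane, since $a^t y_3 < 0$.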

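LDF: Example

- MSE pays too much attention to isolated "noisy" examples (such examples are called outliers).
- There are no problems with convergence, though, and the solution it gives ranges from reasonable to good.

[Figure: two classes with one outlier; the MSE solution is pulled toward the outlier, away from the desired solution.]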

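LDF: Example

- Here we know that the 4th point is far from the separating hyperplane. In practice we don't know this.
- Thus an appropriate choice is $b = [1, 1, 1, 10]^t$.
- In MATLAB, solve a = Y\b:

$$a \approx \begin{bmatrix} -1.1 \\ 1.7 \\ -0.9 \end{bmatrix}$$

- Note that $a$ is only an approximation to $Ya = b$:

$$Ya \approx \begin{bmatrix} 0.9 \\ 1.0 \\ 0.8 \\ 10.0 \end{bmatrix} \neq \begin{bmatrix} 1 \\ 1 \\ 1 \\ 10 \end{bmatrix}$$

- This solution does give a separating hyperplane, since $Ya > 0$.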
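The effect of the margin vector on the outlier example can be reproduced with a short NumPy sketch. The data are the four samples with the outlier read as (0, 10), and the two choices of $b$ are the ones discussed in the example.

```python
import numpy as np

Y = np.array([[ 1.0,  6.0,   9.0],
              [ 1.0,  5.0,   7.0],
              [-1.0, -5.0,  -9.0],
              [-1.0,  0.0, -10.0]])

def mse_fit(Y, b):
    """Least squares solution of Ya ~ b and the resulting margins Ya."""
    a = np.linalg.lstsq(Y, b, rcond=None)[0]
    return a, Y @ a

# Equal margins: the distant 4th sample drags the hyperplane toward itself.
a_equal, margins_equal = mse_fit(Y, np.array([1.0, 1.0, 1.0, 1.0]))

# Larger target margin for the 4th sample, reflecting that it lies far away.
a_weighted, margins_weighted = mse_fit(Y, np.array([1.0, 1.0, 1.0, 10.0]))
```

With equal margins the third component of `margins_equal` comes out slightly negative, while with $b_4 = 10$ all components of `margins_weighted` are positive.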

LDF: Gradient Descent for MSE solution

$$J_s(a) = \|Ya - b\|^2$$

We may wish to find the MSE solution by gradient descent, for two reasons:
1. Computing the inverse of $Y^t Y$ may be too costly.
2. $Y^t Y$ may be close to singular if the samples are highly correlated (rows of $Y$ are almost linear combinations of each other), in which case computing the inverse of $Y^t Y$ is not numerically stable.

At the beginning of the lecture, we computed the gradient:

$$\nabla J_s(a) = 2 Y^t (Ya - b)$$

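LDF: Widrow-Hoff Procedure

$$\nabla J_s(a) = 2 Y^t (Ya - b)$$

Thus the update rule for gradient descent is

$$a^{(k+1)} = a^{(k)} - \eta^{(k)} Y^t (Y a^{(k)} - b)$$

If $\eta^{(k)} = \eta^{(1)} / k$, the weight vector $a^{(k)}$ converges to the MSE solution $a$, that is, to a vector satisfying $Y^t (Ya - b) = 0$.

The Widrow-Hoff procedure reduces storage requirements by considering single samples sequentially:

$$a^{(k+1)} = a^{(k)} - \eta^{(k)} y_i \left( y_i^t a^{(k)} - b_i \right)$$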
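Both update rules can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than a tuned implementation: the function names, the tiny one-feature data set, the starting point $a^{(1)} = 0$, and the value $\eta^{(1)} = 0.4$ are all choices made for this sketch. The decaying rate $\eta^{(k)} = \eta^{(1)}/k$ follows the rule stated above.

```python
import numpy as np

def mse_batch_descent(Y, b, eta1, n_iter):
    """Batch rule: a(k+1) = a(k) - eta(k) Y^t (Y a(k) - b), eta(k) = eta1 / k."""
    a = np.zeros(Y.shape[1])
    for k in range(1, n_iter + 1):
        a = a - (eta1 / k) * (Y.T @ (Y @ a - b))
    return a

def widrow_hoff(Y, b, eta1, n_epochs):
    """Single-sample rule: a(k+1) = a(k) - eta(k) y_i (y_i^t a(k) - b_i)."""
    a = np.zeros(Y.shape[1])
    k = 1
    for _ in range(n_epochs):
        for y_i, b_i in zip(Y, b):          # visit the samples sequentially
            a = a - (eta1 / k) * (y_i @ a - b_i) * y_i
            k += 1
    return a

# Hypothetical 1-D data: class 1 at x = 2, 3; class 2 at x = -1, 0 (rows negated).
Y = np.array([[ 1.0, 2.0],
              [ 1.0, 3.0],
              [-1.0, 1.0],
              [-1.0, 0.0]])
b = np.ones(4)

a_exact = np.linalg.lstsq(Y, b, rcond=None)[0]   # reference MSE solution
a_batch = mse_batch_descent(Y, b, eta1=0.4, n_iter=2000)
a_single = widrow_hoff(Y, b, eta1=0.4, n_epochs=5000)
```

On this small, well-conditioned problem the batch iterate ends up very close to the reference solution. The single-sample iterate approaches it much more slowly, because the $1/k$ schedule shrinks the steps quickly; it trades accuracy per step for never having to form $Y^t Y$.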

LDF: Ho-Kashyap Procedure

- In the MSE procedure, if $b$ is chosen arbitrarily, finding a separating hyperplane is not guaranteed.
- Suppose the training samples are linearly separable. Then there is an $a_s$ and a positive $b_s$ such that

$$Y a_s = b_s > 0$$

- If we knew $b_s$, we could apply the MSE procedure to find the separating hyperplane.
- Idea: find both $a_s$ and $b_s$.
- Minimize the following criterion function, restricting to positive $b$:

$$J_{HK}(a, b) = \|Ya - b\|^2$$

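LDF: Ho-Kashyap Procedure

$$J_{HK}(a, b) = \|Ya - b\|^2$$

As usual, take partial derivatives with respect to $a$ and $b$:

$$\nabla_a J_{HK} = 2 Y^t (Ya - b), \qquad \nabla_b J_{HK} = -2 (Ya - b)$$

Use a modified gradient descent procedure to find a minimum of $J_{HK}(a, b)$. Alternate the two steps below until convergence:
1) Fix $b$ and minimize $J_{HK}(a, b)$ with respect to $a$.
2) Fix $a$ and minimize $J_{HK}(a, b)$ with respect to $b$.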

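LDF: Ho-Kashyap Procedure

$$\nabla_a J_{HK} = 2 Y^t (Ya - b), \qquad \nabla_b J_{HK} = -2 (Ya - b)$$

Alternate the two steps below until convergence:
1) Fix $b$ and minimize $J_{HK}(a, b)$ with respect to $a$.
2) Fix $a$ and minimize $J_{HK}(a, b)$ with respect to $b$.

Step (1) can be performed with the pseudoinverse. For fixed $b$, the minimum of $J_{HK}(a, b)$ with respect to $a$ is found by solving

$$2 Y^t (Ya - b) = 0$$

Thus

$$a = (Y^t Y)^{-1} Y^t b$$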

LDF: Ho-Kashyap Procedure

Step 2: fix $a$ and minimize $J_{HK}(a, b)$ with respect to $b$.
- We can't simply use $b = Ya$, because $b$ has to be positive.
- Solution: use modified gradient descent. Start with a positive $b$ and follow the negative gradient, but refuse to decrease any component of $b$.
- This can be achieved by setting all the positive components of $\nabla_b J$ to 0.
- We are not doing steepest descent anymore, but we are still doing descent, and we ensure that $b$ stays positive.

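LDF: Ho-Kashyap Procedure

The Ho-Kashyap procedure:

0) Start with arbitrary $a^{(1)}$ and $b^{(1)} > 0$, let $k = 1$.

Repeat steps (1) through (4):
1) $e^{(k)} = Y a^{(k)} - b^{(k)}$
2) Solve for $b^{(k+1)}$ using $a^{(k)}$ and $b^{(k)}$: $b^{(k+1)} = b^{(k)} + \eta \left[ e^{(k)} + |e^{(k)}| \right]$
3) Solve for $a^{(k+1)}$ using $b^{(k+1)}$: $a^{(k+1)} = (Y^t Y)^{-1} Y^t b^{(k+1)}$
4) $k = k + 1$

until $e^{(k)} \le$ threshold, or $k > k_{max}$, or $b^{(k+1)} = b^{(k)}$.

For convergence, the learning rate should be fixed with $0 < \eta < 1$.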
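The steps above can be sketched as follows, assuming NumPy. The function name `ho_kashyap`, its arguments, and the exact stopping tests are choices made for this sketch: it stops as soon as $Ya > 0$, or when $e$ has no positive component (so that $b$ can no longer change), or after `k_max` iterations. The data are the four samples with the outlier read as (0, 10).

```python
import numpy as np

def ho_kashyap(Y, a0, b0, eta=0.9, k_max=10000):
    """Alternate the b-update and the pseudo inverse a-update.

    Returns (a, b, separates), where separates tells whether Ya > 0 holds
    for the returned a.
    """
    Y_pinv = np.linalg.pinv(Y)   # equals (Y^t Y)^{-1} Y^t for full column rank
    a = np.asarray(a0, dtype=float)
    b = np.asarray(b0, dtype=float)
    for _ in range(k_max):
        if np.all(Y @ a > 0):            # separating hyperplane found
            break
        e = Y @ a - b                    # step 1: error vector
        if not np.any(e > 0):            # b(k+1) would equal b(k): stop
            break
        b = b + eta * (e + np.abs(e))    # step 2: only grows components of b
        a = Y_pinv @ b                   # step 3: MSE solution for the new b
    return a, b, bool(np.all(Y @ a > 0))

Y = np.array([[ 1.0,  6.0,   9.0],
              [ 1.0,  5.0,   7.0],
              [-1.0, -5.0,  -9.0],
              [-1.0,  0.0, -10.0]])
a, b, separates = ho_kashyap(Y, a0=np.ones(3), b0=np.ones(4), eta=0.9)
```

Because $e + |e|$ is zero wherever $e$ is negative, the update can only increase components of $b$, which keeps $b$ positive throughout.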

LDF: Ho-Kashyap Procedure

In the linearly separable case:
- if $e^{(k)} = 0$, a solution has been found, stop;
- if one of the components of $e^{(k)}$ is positive, the algorithm continues.

In the non-separable case:
- $e^{(k)}$ will eventually have only negative components, which is a proof of non-separability;
- there is no bound on how many iterations are needed for the proof of non-separability.

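LDF: Ho-Kashyap Procedure Example

- Class 1: (6 9), (5 7)
- Class 2: (5 9), (0 10)

Matrix

$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -10 \end{bmatrix}$$

Start with $a^{(1)} = [1, 1, 1]^t$ and $b^{(1)} = [1, 1, 1, 1]^t$. Use a fixed learning rate $\eta = 0.9$.

At the start,

$$Y a^{(1)} = \begin{bmatrix} 16 \\ 13 \\ -15 \\ -11 \end{bmatrix}$$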

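LDF: Ho-Kashyap Procedure Example

Iteration 1:

$$e^{(1)} = Y a^{(1)} - b^{(1)} = \begin{bmatrix} 16 \\ 13 \\ -15 \\ -11 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 15 \\ 12 \\ -16 \\ -12 \end{bmatrix}$$

Solve for $b^{(2)}$ using $a^{(1)}$ and $b^{(1)}$:

$$b^{(2)} = b^{(1)} + 0.9 \left[ e^{(1)} + |e^{(1)}| \right] = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} + 0.9 \left( \begin{bmatrix} 15 \\ 12 \\ -16 \\ -12 \end{bmatrix} + \begin{bmatrix} 15 \\ 12 \\ 16 \\ 12 \end{bmatrix} \right) = \begin{bmatrix} 28 \\ 22.6 \\ 1 \\ 1 \end{bmatrix}$$

Solve for $a^{(2)}$ using $b^{(2)}$:

$$a^{(2)} = (Y^t Y)^{-1} Y^t b^{(2)} \approx \begin{bmatrix} -2.6 & 4.7 & 1.6 & -0.5 \\ 0.16 & -0.08 & -0.09 & 0.17 \\ 0.26 & -0.47 & -0.17 & -0.05 \end{bmatrix} \begin{bmatrix} 28 \\ 22.6 \\ 1 \\ 1 \end{bmatrix} \approx \begin{bmatrix} 34.6 \\ 2.7 \\ -3.8 \end{bmatrix}$$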
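The first iteration is easy to retrace numerically. The sketch below, assuming NumPy and the example's starting values ($a^{(1)}$ and $b^{(1)}$ all ones, $\eta = 0.9$, outlier read as (0, 10)), computes $e^{(1)}$, $b^{(2)}$ and $a^{(2)}$ step by step.

```python
import numpy as np

Y = np.array([[ 1.0,  6.0,   9.0],
              [ 1.0,  5.0,   7.0],
              [-1.0, -5.0,  -9.0],
              [-1.0,  0.0, -10.0]])
a1 = np.ones(3)
b1 = np.ones(4)
eta = 0.9

e1 = Y @ a1 - b1                      # error vector at the start
b2 = b1 + eta * (e1 + np.abs(e1))     # only positive errors change b
a2 = np.linalg.pinv(Y) @ b2           # MSE solution for the updated b
```

Only the first two components of $b$ grow, because only the first two components of $e^{(1)}$ are positive.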

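LDF: Ho-Kashyap Procedure Example

- Continue the iterations until $Ya > 0$. In practice, the stopping test is applied to the minimum component of $Ya$ with a small numerical tolerance.
- The iterations converge to the solution

$$a \approx \begin{bmatrix} -34.9 \\ 27.3 \\ -11.3 \end{bmatrix}, \quad b \approx \begin{bmatrix} 28 \\ 23 \\ 1 \\ 147 \end{bmatrix}$$

- This $a$ does give a separating hyperplane, since every component of $Ya$ is positive:

$$Ya \approx \begin{bmatrix} 27.2 \\ 22.5 \\ 0.14 \\ 148 \end{bmatrix}$$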

LDF: MSE for Multiple Classes

- Suppose we have $m$ classes.
- Define $m$ linear discriminant functions

$$g_i(x) = w_i^t x + w_{i0}, \quad i = 1, \dots, m$$

- Given $x$, assign class $c_i$ if

$$g_i(x) \ge g_j(x) \quad \forall j \ne i$$

- Such a classifier is called a linear machine.
- A linear machine divides the feature space into $m$ decision regions, with $g_i(x)$ being the largest discriminant if $x$ is in the region $R_i$.

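LDF: MSE for Multiple Classes

For each class $i$, find a weight vector $a_i$ such that

$$a_i^t y = 1 \quad \forall y \in \text{class } i, \qquad a_i^t y = 0 \quad \forall y \notin \text{class } i$$

Let $Y_i$ be the matrix whose rows are the samples from class $i$, so it has $d+1$ columns and $n_i$ rows.

Let's pile all samples in an $n$ by $d+1$ matrix $Y$:

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix}$$

so that the first $n_1$ rows are the samples from class 1, the next $n_2$ rows are the samples from class 2, and so on up to class $m$.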

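LDF: MSE for Multiple Classes

Let $b_i$ be a column vector of length $n$ which is 0 everywhere except in the rows corresponding to samples from class $i$, where it is 1:

$$b_i = [0, \dots, 0, \; 1, \dots, 1, \; 0, \dots, 0]^t$$

with the ones in the rows corresponding to samples from class $i$.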

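LDF: MSE for Multiple Classes

Let's pile all the $b_i$ as columns in an $n$ by $m$ matrix $B$:

$$B = [\, b_1 \; \cdots \; b_m \,]$$

Let's pile all the $a_i$ as columns in a $d+1$ by $m$ matrix $A$:

$$A = [\, a_1 \; \cdots \; a_m \,]$$

The $m$ LSE problems can then be represented as a single matrix equation:

$$YA = B$$

[Figure: the rows of $Y$, grouped by class, multiplied by $A$, give the rows of $B$; each row of $B$ has a single 1 in the column of its sample's class.]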

LDF: MSE for Multiple Classes

Our objective function is

$$J(A) = \sum_{i=1}^{m} \|Y a_i - b_i\|^2$$

$J(A)$ is minimized with the use of the pseudoinverse:

$$A = (Y^t Y)^{-1} Y^t B$$
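A linear machine trained this way can be sketched with NumPy as below. The three small clusters are hypothetical data placed at the corners of a triangle, the function names are illustrative, and the class labels are 0-based integers for indexing convenience. Note that samples are augmented with a leading 1 but not negated, since each column of $B$ carries the class information.

```python
import numpy as np

def augment(X):
    """Prepend the constant feature 1 to every sample."""
    X = np.asarray(X, dtype=float)
    return np.hstack([np.ones((len(X), 1)), X])

def fit_linear_machine(X, labels, m):
    """Return A = pinv(Y) B, with B the n x m class indicator matrix."""
    Y = augment(X)
    labels = np.asarray(labels, dtype=int)
    B = np.zeros((len(labels), m))
    B[np.arange(len(labels)), labels] = 1.0   # one 1 per row, in the class column
    return np.linalg.pinv(Y) @ B              # (d + 1) x m weight matrix

def predict(A, X):
    """Assign each sample to the class with the largest discriminant."""
    return np.argmax(augment(X) @ A, axis=1)

# Hypothetical, well separated 2-D clusters.
X = [(0, 0), (1, 0), (0, 1),        # class 0
     (10, 0), (11, 0), (10, 1),     # class 1
     (5, 10), (6, 10), (5, 11)]     # class 2
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

A = fit_linear_machine(X, labels, m=3)
predicted = predict(A, X)
```

Each column of `A` is the MSE solution of one of the $m$ problems $Y a_i \approx b_i$, so a single pseudoinverse solves all of them at once.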

LDF: Summary

Perceptron procedures:
- find a separating hyperplane in the linearly separable case;
- do not converge in the non-separable case;
- can be forced to converge by using a decreasing learning rate, but are then not guaranteed a reasonable stopping point.

MSE procedures:
- converge in both the separable and the non-separable case;
- may not find a separating hyperplane even if the classes are linearly separable;
- use the pseudoinverse if $Y^t Y$ is not singular and not too large;
- use gradient descent (Widrow-Hoff procedure) otherwise.

Ho-Kashyap procedures:
- always converge;
- find a separating hyperplane in the linearly separable case;
- are more costly.