Minimum Squared Error

LDF: Minimum Squared-Error Procedures
Idea: convert to an easier and better understood problem.
Perceptron: $a^T y_i > 0$ for all samples $y_i$, i.e. solve a system of linear inequalities.
MSE procedure: $a^T y_i = b_i$ for all samples $y_i$, i.e. solve a system of linear equations.
Choose positive constants $b_1, b_2, \dots, b_n$ and try to find a weight vector $a$ s.t. $a^T y_i = b_i$ for all samples $y_i$.
If we can find such a weight vector $a$, then $a$ is a solution, because the $b_i$ are positive.
MSE considers all the samples (not just the misclassified ones).
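
LDF: MSE Margins
[Figure: hyperplane $g(y) = 0$ with samples $y_i$ and $y_k$ at different distances from it]
Since we want $a^T y_i = b_i$, we expect sample $y_i$ to be at distance $b_i$ from the separating hyperplane (normalized by $\|a\|$).
Thus $b_1, b_2, \dots, b_n$ give relative expected distances, or "margins", of the samples from the hyperplane.
Make $b_i$ small if sample $i$ is expected to be near the separating hyperplane, and make $b_i$ larger otherwise.
In the absence of any additional information, there are good reasons to set $b_1 = b_2 = \dots = b_n = 1$.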
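
LDF: MSE Matrix Notation
Need to solve $n$ equations:
$$a^T y_1 = b_1, \quad a^T y_2 = b_2, \quad \dots, \quad a^T y_n = b_n$$
Introduce matrix notation:
$$Y = \begin{bmatrix} y_1^{(0)} & y_1^{(1)} & \cdots & y_1^{(d)} \\ y_2^{(0)} & y_2^{(1)} & \cdots & y_2^{(d)} \\ \vdots & \vdots & & \vdots \\ y_n^{(0)} & y_n^{(1)} & \cdots & y_n^{(d)} \end{bmatrix}, \qquad a = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_d \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
Thus we need to solve the linear system $Ya = b$.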

LDF: Exact Solution is Rare
Thus we need to solve the linear system $Ya = b$, where $Y$ is an $n$ by $(d + 1)$ matrix.
An exact solution can be found only if $Y$ is nonsingular and square, in which case the inverse $Y^{-1}$ exists:
$$a = Y^{-1} b$$
This requires (number of samples) = (number of features + 1), which almost never happens in practice.
In this case, we are guaranteed to find the separating hyperplane.

LDF: Approximate Solution
Typically the system $Ya = b$ is overdetermined, that is, $Y$ has more rows (examples) than columns (features).
If it has more features than examples, we should reduce dimensionality.
We need $Ya = b$, but no exact solution exists for an overdetermined system of equations (more equations than unknowns).
Find an approximate solution $a$, that is $Ya \approx b$.
Note that an approximate solution $a$ does not necessarily give the separating hyperplane in the separable case.
But the hyperplane corresponding to $a$ may still be a good solution, especially if there is no separating hyperplane.

LDF: MSE Criterion Function
Minimum squared error approach: find $a$ which minimizes the length of the error vector
$$e = Ya - b$$
Thus minimize the minimum squared error criterion function:
$$J_s(a) = \|Ya - b\|^2 = \sum_{i=1}^{n} (a^T y_i - b_i)^2$$
Unlike the perceptron criterion function, we can optimize the minimum squared error criterion function analytically by setting the gradient to 0.
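
LDF: Optimizing $J_s(a)$
$$J_s(a) = \|Ya - b\|^2 = \sum_{i=1}^{n} (a^T y_i - b_i)^2$$
Let's compute the gradient:
$$\nabla J_s(a) = \sum_{i=1}^{n} 2 (a^T y_i - b_i)\, y_i = 2 Y^T (Ya - b)$$
Setting the gradient to 0:
$$2 Y^T (Ya - b) = 0 \quad \Rightarrow \quad Y^T Y a = Y^T b$$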
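
The gradient formula can be sanity-checked numerically. The sketch below is only an illustration: it assumes NumPy, reuses the four samples of the lecture's first worked example, and picks an arbitrary test point `a`; the helper names `mse_criterion` and `mse_gradient` are hypothetical.

```python
import numpy as np

def mse_criterion(Y, a, b):
    """J_s(a) = ||Ya - b||^2."""
    r = Y @ a - b
    return float(r @ r)

def mse_gradient(Y, a, b):
    """Analytic gradient 2 Y^T (Ya - b)."""
    return 2.0 * Y.T @ (Y @ a - b)

# Rows are augmented, "normalized" samples (class 2 rows negated).
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -4.]])
b = np.ones(4)
a = np.array([0.5, -0.2, 0.1])   # arbitrary test point (assumption)

# Central finite differences, one coordinate direction at a time.
h = 1e-5
numeric = np.array([
    (mse_criterion(Y, a + h * d, b) - mse_criterion(Y, a - h * d, b)) / (2 * h)
    for d in np.eye(3)
])
analytic = mse_gradient(Y, a, b)
print(numeric, analytic)
```

Because $J_s$ is quadratic in $a$, the central difference agrees with the analytic gradient up to floating-point rounding.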
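
LDF: Pseudo Inverse Solution
$$\nabla J_s(a) = 2 Y^T (Ya - b) = 0 \quad \Rightarrow \quad Y^T Y a = Y^T b$$
The matrix $Y^T Y$ is square (it has $d + 1$ rows and columns) and it is often non-singular.
If $Y^T Y$ is non-singular, its inverse exists and we can solve for $a$ uniquely:
$$a = (Y^T Y)^{-1} Y^T b = Y^+ b$$
where $Y^+ = (Y^T Y)^{-1} Y^T$ is the pseudo-inverse of $Y$. Note that
$$Y^+ Y = (Y^T Y)^{-1} Y^T Y = I$$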
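
A minimal NumPy sketch of the pseudo-inverse, assuming $Y^T Y$ is non-singular and reusing the samples of the lecture's first worked example. It checks $Y^+ Y = I$ and compares the normal-equation form against `np.linalg.pinv`.

```python
import numpy as np

# Four augmented, "normalized" samples (rows), d + 1 = 3 columns.
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -4.]])

# Pseudo-inverse from the normal equations (valid when Y^T Y is non-singular).
Y_pinv = np.linalg.inv(Y.T @ Y) @ Y.T

left = Y_pinv @ Y     # 3 x 3, equals the identity
right = Y @ Y_pinv    # 4 x 4 projection matrix, generally not the identity

print(np.round(left, 6))
```

In numerical practice, `np.linalg.pinv` or a least-squares solver is usually preferred over forming the inverse of $Y^T Y$ explicitly.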

LDF: Minimum Squared-Error Procedures
If $b_1 = \dots = b_n = 1$, the MSE procedure is equivalent to finding a hyperplane of best fit through the samples $y_1, \dots, y_n$:
$$J_s(a) = \|Ya - 1_n\|^2, \qquad 1_n = [1, \dots, 1]^T$$
Then we shift this line to the origin. If this line was a good fit, all samples will be classified correctly.
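
LDF: Minimum Squared-Error Procedures
We are only guaranteed the separating hyperplane if $Ya > 0$, that is, if all elements of the vector
$$Ya = [a^T y_1, \dots, a^T y_n]^T$$
are positive.
We have $Ya \approx b$. That is,
$$Ya = [b_1 + e_1, \dots, b_n + e_n]^T$$
where the $e_i$ may be negative.
If $e_1, \dots, e_n$ are small relative to $b_1, \dots, b_n$, then each element of $Ya$ is positive, and $a$ gives a separating hyperplane.
If the approximation is not good, $e_i$ may be large and negative for some $i$; then $b_i + e_i$ will be negative and $a$ is not a separating hyperplane.
Thus in the linearly separable case, the least squares solution $a$ does not necessarily give a separating hyperplane.
But it will give a "reasonable" hyperplane.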
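
LDF: Minimum Squared-Error Procedures
We are free to choose $b$. We may be tempted to make $b$ large as a way to ensure $Ya \approx b > 0$.
This does not work.
Let $\beta$ be a scalar, and let's try $\beta b$ instead of $b$.
If $a^*$ is a least squares solution to $Ya = b$, then for any scalar $\beta$, the least squares solution to $Ya = \beta b$ is $\beta a^*$:
$$\arg\min_a \|Ya - \beta b\|^2 = \arg\min_a \beta^2 \|Y(a/\beta) - b\|^2 = \beta a^*$$
Thus if for some $i$ the $i$-th element of $Ya^*$ is less than 0, that is $y_i^T a^* < 0$, then $y_i^T (\beta a^*) < 0$ for any $\beta > 0$.
The relative difference between the components of $b$ matters, but not the size of each individual component.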
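
The scaling argument can be illustrated with a short NumPy sketch. It assumes `np.linalg.lstsq` as the least squares solver, uses the outlier data from the lecture's second worked example, and takes an arbitrary `beta = 100`.

```python
import numpy as np

# Class 1: (6, 9), (5, 7); class 2: (5, 9), (0, 10), augmented and "normalized".
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -10.]])
b = np.ones(4)
beta = 100.0   # arbitrary positive scale factor (assumption)

a_star, *_ = np.linalg.lstsq(Y, b, rcond=None)
a_scaled, *_ = np.linalg.lstsq(Y, beta * b, rcond=None)

# Scaling b scales the solution, so the sign of every y_i^T a is unchanged.
signs_star = np.sign(Y @ a_star)
signs_scaled = np.sign(Y @ a_scaled)
print(signs_star, signs_scaled)
```

The third sample stays on the wrong side of the hyperplane no matter how large `beta` is.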
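
LDF: Example
Class 1: (6 9), (5 7)
Class 2: (5 9), (0 4)
Set vectors $y_1, y_2, y_3, y_4$ by adding an extra feature and "normalizing":
$$y_1 = \begin{bmatrix} 1 \\ 6 \\ 9 \end{bmatrix}, \quad y_2 = \begin{bmatrix} 1 \\ 5 \\ 7 \end{bmatrix}, \quad y_3 = \begin{bmatrix} -1 \\ -5 \\ -9 \end{bmatrix}, \quad y_4 = \begin{bmatrix} -1 \\ 0 \\ -4 \end{bmatrix}$$
The matrix $Y$ is then
$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -4 \end{bmatrix}$$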
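
The construction of $Y$ can be sketched in NumPy as follows. The helper name `build_normalized_matrix` is hypothetical; it simply prepends a 1 to every sample and negates the class 2 rows.

```python
import numpy as np

def build_normalized_matrix(class1, class2):
    """Augment each sample with a leading 1, then negate the class 2 rows."""
    c1 = np.asarray(class1, dtype=float)
    c2 = np.asarray(class2, dtype=float)
    y1 = np.hstack([np.ones((len(c1), 1)), c1])
    y2 = -np.hstack([np.ones((len(c2), 1)), c2])
    return np.vstack([y1, y2])

Y = build_normalized_matrix([(6, 9), (5, 7)], [(5, 9), (0, 4)])
print(Y)
```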
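
LDF: Example
Choose $b = [1, 1, 1, 1]^T$.
In MATLAB, `a = Y\b` solves the least squares problem:
$$a \approx [2.7,\ 1.0,\ -0.9]^T$$
Note $a$ is an approximation to $Ya = b$, since no exact solution exists:
$$Ya \approx [0.4,\ 1.3,\ 0.6,\ 1.1]^T \ne [1,\ 1,\ 1,\ 1]^T$$
This solution does give a separating hyperplane since $Ya > 0$.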
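
A NumPy counterpart of the MATLAB call, as a sketch: `np.linalg.lstsq` is assumed here to play the role of `Y\b` for this overdetermined, full-rank system.

```python
import numpy as np

# Class 1: (6, 9), (5, 7); class 2: (5, 9), (0, 4), augmented and "normalized".
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -4.]])
b = np.ones(4)

a, *_ = np.linalg.lstsq(Y, b, rcond=None)   # least squares solution of Ya = b
margins = Y @ a                             # approximates b, not equal to it

print(np.round(a, 1))
print(np.round(margins, 1))
```

All entries of `margins` are positive, so this `a` separates the two classes.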
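
LDF: Example
Class 1: (6 9), (5 7)
Class 2: (5 9), (0 10)
The last sample is very far from the separating hyperplane compared to the others.
$$y_1 = \begin{bmatrix} 1 \\ 6 \\ 9 \end{bmatrix}, \quad y_2 = \begin{bmatrix} 1 \\ 5 \\ 7 \end{bmatrix}, \quad y_3 = \begin{bmatrix} -1 \\ -5 \\ -9 \end{bmatrix}, \quad y_4 = \begin{bmatrix} -1 \\ 0 \\ -10 \end{bmatrix}$$
$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -10 \end{bmatrix}$$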
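
LDF: Example
Choose $b = [1, 1, 1, 1]^T$.
In MATLAB, `a = Y\b` solves the least squares problem:
$$a \approx [3.2,\ 0.2,\ -0.4]^T$$
Note $a$ is an approximation to $Ya = b$, since no exact solution exists:
$$Ya \approx [0.2,\ 0.9,\ -0.04,\ 1.16]^T \ne [1,\ 1,\ 1,\ 1]^T$$
This solution does not give a separating hyperplane since $a^T y_3 < 0$.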

LDF: Example
MSE pays too much attention to isolated "noisy" examples (such examples are called outliers).
[Figure: the outlier, the MSE solution, and the desired solution]
There are no problems with convergence though, and the solution it gives ranges from reasonable to good.
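
LDF: Example
We know that the 4th point is far from the separating hyperplane. In practice we don't know this.
Thus an appropriate choice is $b = [1, 1, 1, 10]^T$.
In MATLAB, solve `a = Y\b`:
$$a \approx [-1.1,\ 1.7,\ -0.9]^T$$
Note $a$ is an approximation to $Ya = b$:
$$Ya \approx [0.9,\ 1.0,\ 0.8,\ 10.0]^T \ne [1,\ 1,\ 1,\ 10]^T$$
This solution does give the separating hyperplane since $Ya > 0$.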
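
The effect of the margin vector on the outlier data can be reproduced with a short NumPy sketch, again assuming `np.linalg.lstsq` in place of MATLAB's backslash. The variable names `a_equal` and `a_weighted` are only illustrative.

```python
import numpy as np

# Class 1: (6, 9), (5, 7); class 2: (5, 9), (0, 10), augmented and "normalized".
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -10.]])

# Equal margins: the far-away fourth sample pulls the hyperplane too much.
a_equal, *_ = np.linalg.lstsq(Y, np.ones(4), rcond=None)

# Larger expected margin for the fourth sample.
a_weighted, *_ = np.linalg.lstsq(Y, np.array([1., 1., 1., 10.]), rcond=None)

print(np.round(Y @ a_equal, 2))      # third entry is negative
print(np.round(Y @ a_weighted, 1))   # all entries positive
```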

LDF: Gradient Descent for MSE solution
$$J_s(a) = \|Ya - b\|^2$$
We may wish to find the MSE solution by gradient descent:
1. Computing the inverse of $Y^T Y$ may be too costly.
2. $Y^T Y$ may be close to singular if the samples are highly correlated (rows of $Y$ are almost linear combinations of each other); computing the inverse of $Y^T Y$ is then not numerically stable.
In the beginning of the lecture, we computed the gradient:
$$\nabla J_s(a) = 2 Y^T (Ya - b)$$
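
LDF: Widrow-Hoff Procedure
$$\nabla J_s(a) = 2 Y^T (Ya - b)$$
Thus the update rule for gradient descent is:
$$a^{(k+1)} = a^{(k)} - \eta^{(k)}\, Y^T (Y a^{(k)} - b)$$
If $\eta^{(k)} = \eta^{(1)} / k$, the weight vector $a^{(k)}$ converges to the MSE solution $a$, that is $Y^T (Ya - b) = 0$.
The Widrow-Hoff procedure reduces storage requirements by considering single samples sequentially:
$$a^{(k+1)} = a^{(k)} - \eta^{(k)}\, y_i\, (y_i^T a^{(k)} - b_i)$$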
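
A minimal sketch of the single-sample rule, under stated assumptions: `eta1 = 0.01` and `epochs = 200` are arbitrary choices, the function name `widrow_hoff` is hypothetical, and the margin vector is deliberately set to `b = Y @ a_true` for a made-up `a_true` so that an exact solution exists and progress toward it is easy to check.

```python
import numpy as np

def widrow_hoff(Y, b, eta1=0.01, epochs=200):
    """Single-sample updates a <- a - eta(k) * y_i * (y_i^T a - b_i), eta(k) = eta1 / k."""
    a = np.zeros(Y.shape[1])
    k = 1
    for _ in range(epochs):
        for y_i, b_i in zip(Y, b):
            a = a - (eta1 / k) * y_i * (y_i @ a - b_i)
            k += 1
    return a

Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -4.]])
a_true = np.array([2.7, 1.0, -0.9])   # hypothetical target weights
b = Y @ a_true                        # positive margins that a_true fits exactly

a = widrow_hoff(Y, b)
print(np.linalg.norm(a_true), np.linalg.norm(a - a_true))
```

With the $1/k$ schedule the iterates move toward the solution but can do so slowly, so the sketch only reports the distance to `a_true` rather than claiming full convergence after 200 passes.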

LDF: Ho-Kashyap Procedure
In the MSE procedure, if $b$ is chosen arbitrarily, finding a separating hyperplane is not guaranteed.
Suppose the training samples are linearly separable. Then there is an $a_s$ and a positive $b_s$ s.t.
$$Y a_s = b_s > 0$$
If we knew $b_s$, we could apply the MSE procedure to find the separating hyperplane.
Idea: find both $a_s$ and $b_s$.
Minimize the following criterion function, restricting to positive $b$:
$$J_{HK}(a, b) = \|Ya - b\|^2$$
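
LDF: Ho-Kashyap Procedure
$$J_{HK}(a, b) = \|Ya - b\|^2$$
As usual, take partial derivatives w.r.t. $a$ and $b$:
$$\nabla_a J_{HK} = 2 Y^T (Ya - b), \qquad \nabla_b J_{HK} = -2 (Ya - b)$$
Use a modified gradient descent procedure to find a minimum of $J_{HK}(a, b)$.
Alternate the two steps below until convergence:
1) Fix $b$ and minimize $J_{HK}(a, b)$ with respect to $a$
2) Fix $a$ and minimize $J_{HK}(a, b)$ with respect to $b$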

LDF: Ho-Kashyap Procedure
Step 1 (fix $b$, minimize over $a$) can be performed with the pseudoinverse.
For fixed $b$, the minimum of $J_{HK}(a, b)$ with respect to $a$ is found by solving
$$\nabla_a J_{HK} = 2 Y^T (Ya - b) = 0$$
Thus
$$a = (Y^T Y)^{-1} Y^T b = Y^+ b$$

LDF: Ho-Kashyap Procedure
Step 2: fix $a$ and minimize $J_{HK}(a, b)$ with respect to $b$.
We can't use $b = Ya$ because $b$ has to be positive.
Solution: use modified gradient descent. Start with a positive $b$ and follow the negative gradient, but refuse to decrease any component of $b$.
This can be achieved by setting all the positive components of $\nabla_b J_{HK}$ to 0.
We are not doing steepest descent anymore, but we are still doing descent and we ensure that $b$ stays positive.
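
LDF: Ho-Kashyap Procedure
The Ho-Kashyap procedure:
0) Start with arbitrary $a^{(1)}$ and $b^{(1)} > 0$, let $k = 1$
repeat steps (1) through (4)
1) $e^{(k)} = Y a^{(k)} - b^{(k)}$
2) Solve for $b^{(k+1)}$ using $a^{(k)}$ and $b^{(k)}$: $b^{(k+1)} = b^{(k)} + \eta\, [\, e^{(k)} + |e^{(k)}| \,]$
3) Solve for $a^{(k+1)}$ using $b^{(k+1)}$: $a^{(k+1)} = (Y^T Y)^{-1} Y^T b^{(k+1)}$
4) $k = k + 1$
until $e^{(k)} \le$ threshold or $k > k_{\max}$ or $b^{(k+1)} = b^{(k)}$
For convergence, the learning rate should be fixed with $0 < \eta < 1$.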

LDF: Ho-Kashyap Procedure
In the linearly separable case:
  if $e^{(k)} = 0$, we have found a solution, stop;
  if one of the components of $e^{(k)}$ is positive, the algorithm continues.
In the non-separable case:
  $e^{(k)}$ will eventually have only negative components, which is a proof of nonseparability.
  There is no bound on how many iterations are needed for the proof of nonseparability.
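
The procedure can be sketched in NumPy as below. This is an illustration under stated assumptions, not a reference implementation: the function name `ho_kashyap`, the defaults `k_max=1000` and `tol=1e-9`, and the extra early exit as soon as $Ya > 0$ are choices made here. The data are the outlier samples used in the lecture's Ho-Kashyap example.

```python
import numpy as np

def ho_kashyap(Y, a, b, eta=0.9, k_max=1000, tol=1e-9):
    """Alternate the b-update and the pseudo-inverse a-update; 0 < eta < 1."""
    Y_pinv = np.linalg.pinv(Y)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    for k in range(1, k_max + 1):
        e = Y @ a - b                    # step 1: error vector
        if np.all(e <= tol):             # no positive component left: stop
            break
        b = b + eta * (e + np.abs(e))    # step 2: b can only increase
        a = Y_pinv @ b                   # step 3: MSE solution for the new b
        if np.all(Y @ a > 0):            # separating hyperplane found (early exit)
            break
    return a, b, k, bool(np.all(Y @ a > 0))

# Class 1: (6, 9), (5, 7); class 2: (5, 9), (0, 10), augmented and "normalized".
Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -10.]])
a, b, k, separable = ho_kashyap(Y, np.ones(3), np.ones(4))
print(k, separable, np.round(a, 1))
```

If the loop stops because `e` has no positive component while some entry of `Y @ a` is not positive, the returned flag is `False`, which corresponds to the nonseparability evidence described above.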
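
LDF: Ho-Kashyap Procedure Example
Class 1: (6 9), (5 7)
Class 2: (5 9), (0 10)
Matrix
$$Y = \begin{bmatrix} 1 & 6 & 9 \\ 1 & 5 & 7 \\ -1 & -5 & -9 \\ -1 & 0 & -10 \end{bmatrix}$$
Start with $a^{(1)} = [1, 1, 1]^T$ and $b^{(1)} = [1, 1, 1, 1]^T$.
Use a fixed learning rate $\eta = 0.9$.
At the start,
$$Y a^{(1)} = [16,\ 13,\ -15,\ -11]^T$$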
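
LDF: Ho-Kashyap Procedure Example
Iteration 1:
$$e^{(1)} = Y a^{(1)} - b^{(1)} = [16,\ 13,\ -15,\ -11]^T - [1,\ 1,\ 1,\ 1]^T = [15,\ 12,\ -16,\ -12]^T$$
Solve for $b^{(2)}$ using $a^{(1)}$ and $b^{(1)}$:
$$b^{(2)} = b^{(1)} + 0.9\, [\, e^{(1)} + |e^{(1)}| \,] = [1,\ 1,\ 1,\ 1]^T + 0.9\, ([15,\ 12,\ -16,\ -12]^T + [15,\ 12,\ 16,\ 12]^T) = [28,\ 22.6,\ 1,\ 1]^T$$
Solve for $a^{(2)}$ using $b^{(2)}$:
$$a^{(2)} = (Y^T Y)^{-1} Y^T b^{(2)} \approx \begin{bmatrix} -2.63 & 4.74 & 1.58 & -0.47 \\ 0.16 & -0.08 & -0.09 & 0.17 \\ 0.26 & -0.47 & -0.17 & -0.05 \end{bmatrix} \begin{bmatrix} 28 \\ 22.6 \\ 1 \\ 1 \end{bmatrix} \approx \begin{bmatrix} 34.6 \\ 2.7 \\ -3.8 \end{bmatrix}$$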
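
The arithmetic of this first iteration can be replayed in a few lines of NumPy. This is only a check of the numbers, with `np.linalg.pinv` standing in for $(Y^T Y)^{-1} Y^T$; the variable names are illustrative.

```python
import numpy as np

Y = np.array([[1., 6., 9.], [1., 5., 7.], [-1., -5., -9.], [-1., 0., -10.]])
a1 = np.ones(3)    # a(1)
b1 = np.ones(4)    # b(1)
eta = 0.9

e1 = Y @ a1 - b1                     # error vector e(1)
b2 = b1 + eta * (e1 + np.abs(e1))    # only positive errors raise b
a2 = np.linalg.pinv(Y) @ b2          # MSE solution for the new margins

print(e1, b2, np.round(a2, 1))
```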
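
LDF: Ho-Kashyap Procedure Example
Continue the iterations until $Ya > 0$.
In practice, the test on the minimum component of $Ya$ uses a small numerical threshold.
After 104 iterations the procedure converged to the solution
$$a \approx [-34.9,\ 27.3,\ -11.3]^T, \qquad b \approx [28,\ 23,\ 1,\ 147]^T$$
This $a$ does give a separating hyperplane, since every component of $Ya$ is positive.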

LDF: MSE for Multiple Classes
Suppose we have $m$ classes.
Define $m$ linear discriminant functions
$$g_i(x) = w_i^T x + w_{i0}, \qquad i = 1, \dots, m$$
Given $x$, assign class $c_i$ if
$$g_i(x) \ge g_j(x) \quad \forall j \ne i$$
Such a classifier is called a linear machine.
A linear machine divides the feature space into $m$ decision regions, with $g_i(x)$ being the largest discriminant if $x$ is in the region $R_i$.
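
A linear machine reduces to an argmax over the discriminants. The sketch below uses made-up weights for three classes in a 2-D feature space; `linear_machine`, `W` and `w0` are hypothetical names, and classes are indexed from 0 rather than 1.

```python
import numpy as np

def linear_machine(W, w0, x):
    """Return the index i that maximizes g_i(x) = w_i^T x + w_i0."""
    return int(np.argmax(W @ x + w0))

# Hypothetical weights: row i of W is w_i, entry i of w0 is w_i0.
W = np.array([[1., 0.], [0., 1.], [-1., -1.]])
w0 = np.zeros(3)

labels = [linear_machine(W, w0, np.array(p)) for p in [(2., 0.), (0., 3.), (-1., -1.)]]
print(labels)
```

`np.argmax` returns the first maximum, so ties between discriminants are broken in favor of the lower class index.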
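
LDF: MSE for Multiple Classes
For each class $i$, find a weight vector $a_i$ s.t.
$$a_i^T y = 1 \quad \forall y \in \text{class } i, \qquad a_i^T y = 0 \quad \forall y \notin \text{class } i$$
Let $Y_i$ be the matrix whose rows are the samples from class $i$, so it has $d + 1$ columns and $n_i$ rows.
Let's pile all samples in an $n$ by $d + 1$ matrix $Y$:
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix}$$
so the first rows of $Y$ are the samples from class 1 and the last rows are the samples from class $m$.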
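
LDF: MSE for Multiple Classes
Let $b_i$ be a column vector of length $n$ which is 0 everywhere except in the rows corresponding to samples from class $i$, where it is 1:
$$b_i = [0, \dots, 0, 1, \dots, 1, 0, \dots, 0]^T$$
with the 1 entries in the rows corresponding to samples from class $i$.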
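
LDF: MSE for Multiple Classes
Let's pile all $b_i$ as columns in an $n$ by $m$ matrix $B$:
$$B = [\, b_1 \ \cdots \ b_m \,]$$
Let's pile all $a_i$ as columns in a $d + 1$ by $m$ matrix $A$:
$$A = [\, a_1 \ \cdots \ a_m \,]$$
The $m$ LSE problems can be represented as $YA = B$.
[Figure: $YA = B$ for three classes; each row of $Y$ is a sample, and the matching row of $B$ has a 1 in the column of that sample's class and 0 elsewhere]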

LDF: MSE for Multiple Classes
Our objective function is:
$$J(A) = \sum_{i=1}^{m} \|Y a_i - b_i\|^2$$
$J(A)$ is minimized with the use of the pseudoinverse:
$$A = (Y^T Y)^{-1} Y^T B$$
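
A minimal NumPy sketch of the multi-class solution on a made-up data set: three classes with two 2-D samples each, placed near the corners of a triangle. The data, the 0-based labels and the variable names are assumptions made for illustration. Here the rows of $Y$ are augmented with a leading 1 but not negated.

```python
import numpy as np

# Hypothetical samples: two per class.
X = np.array([[0., 0.], [2., 2.], [6., 0.], [8., 2.], [0., 6.], [2., 8.]])
labels = np.array([0, 0, 1, 1, 2, 2])

Y = np.hstack([np.ones((len(X), 1)), X])   # n x (d + 1)
B = np.eye(3)[labels]                      # n x m indicator matrix
A = np.linalg.pinv(Y) @ B                  # (d + 1) x m, minimizes J(A)

scores = Y @ A                             # row i holds a_1^T y_i, ..., a_m^T y_i
predicted = scores.argmax(axis=1)          # linear machine decision
print(np.round(scores, 3))
print(predicted)
```

On this toy set every training sample is assigned to its own class. That is not guaranteed in general, since the least squares fit only approximates $B$.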

LDF: Summary
Perceptron procedures
  find a separating hyperplane in the linearly separable case,
  do not converge in the non-separable case,
  can force convergence by using a decreasing learning rate, but are not guaranteed a reasonable stopping point.
MSE procedures
  converge in the separable and the non-separable case,
  may not find a separating hyperplane even if the classes are linearly separable,
  use the pseudoinverse if $Y^T Y$ is not singular and not too large,
  use gradient descent (Widrow-Hoff procedure) otherwise.
Ho-Kashyap procedures
  always converge,
  find a separating hyperplane in the linearly separable case,
  are more costly.