
A NEW ALGORITHM FOR FINDING THE MINIMUM DISTANCE BETWEEN TWO CONVEX HULLS
Dougsoo Kaown, B.Sc., M.Sc.
Dissertation Prepared for the Degree of DOCTOR OF PHILOSOPHY
UNIVERSITY OF NORTH TEXAS
May 2009
APPROVED:
Jianguo Liu, Major Professor
John Neuberger, Committee Member
Xiaohui Yuan, Committee Member
Matthew Douglass, Chair of the Department of Mathematics
Michael Monticino, Interim Dean of the Robert B. Toulouse School of Graduate Studies

Kaown, Dougsoo. A New Algorithm for Finding the Minimum Distance between Two Convex Hulls. Doctor of Philosophy (Mathematics), May 2009, 66 pp., 24 illustrations, bibliography, 27 titles. The problem of computing the minimum distance between two convex hulls has applications to many areas including robotics, computer graphics and path planning. Moreover, determining the minimum distance between two convex hulls plays a significant role in support vector machines (SVM). In this study, a new algorithm for finding the minimum distance between two convex hulls is proposed and investigated. The convergence of the algorithm is proved, and its applicability to support vector machines is demonstrated. The performance of the new algorithm is compared with that of one of the most popular algorithms, the sequential minimal optimization (SMO) method. The new algorithm is simple to understand, easy to implement, and can be more efficient than the SMO method for many SVM problems.

Copyright 2009 by Dougsoo Kaown

ACKNOWLEDGEMENTS
I thank my advisor Dr. Liu, Jianguo, who gave me endless support and guidance; there are no words to describe his support and guidance. I thank my PhD committee members Dr. Neuberger and Dr. Yuan. I thank my father. Even though he has already left our family, I always think about him. Without my father, my mother raised me with her heart and endless love. I thank my mother; she always supports me. I also thank my wife and my daughter, who always love me and are with me. Without them, I could not have finished this pleasant and wonderful journey. I thank the Department of Mathematics for giving me this opportunity and supporting me on this journey. I thank all staff and faculty members in the department; they are my lifetime teachers.

CONTENTS
ACKNOWLEDGEMENTS
CHAPTER 1. INTRODUCTION
1.1. Introduction and Discussion of the Problem
1.2. Previous Works
CHAPTER 2. A NEW ALGORITHM FOR FINDING THE MINIMUM DISTANCE BETWEEN TWO CONVEX HULLS
2.1. A New Algorithm for Finding the Minimum Distance of Two Convex Hulls
2.2. Convergence for Algorithm 2.1
CHAPTER 3. SUPPORT VECTOR MACHINES
3.1. Introduction to Support Vector Machines (SVM)
3.2. SVM for Linearly Separable Sets
3.3. SVM Dual Formulation
3.4. Linearly Nonseparable Training Set
3.5. Kernel SVM
CHAPTER 4. SOLVING SUPPORT VECTOR MACHINES USING THE NEW ALGORITHM
4.1. Solving SVM Problems Using the New Algorithm
4.2. Stopping Method for the Algorithm
CHAPTER 5. EXPERIMENTS
5.1. Comparison with Linearly Separable Sets
5.2. Comparison with Real World Problems
CHAPTER 6. CONCLUSION AND FUTURE WORKS

6.1. Conclusion
6.2. Future Work
BIBLIOGRAPHY

CHAPTER 1
INTRODUCTION
1.1. Introduction and Discussion of the Problem
Finding the minimum distance between two convex hulls is a popular research topic with important real-world applications. The fields of robotics, animation, computer graphics and path planning all use this distance calculation. For example, in robotics the distance between two convex hulls is calculated to determine the path of a robot so that it can avoid collisions. However, computing the minimum distance between two convex hulls plays its most significant role in support vector machines (SVM) [6], [7], [10], [15]. SVM is a very useful classification tool that owes its popularity to its impressive performance on many real-world problems. Many fast iterative methods have been introduced to solve SVM problems; the popular sequential minimal optimization (SMO) method is one of them. There are also many algorithms for finding the minimum distance between two convex hulls, including the Gilbert algorithm and the Mitchell-Dem'yanov-Malozemov (MDM) algorithm. These two algorithms are the most basic algorithms for solving SVM problems geometrically, and many published papers are based on them. Keerthi's paper [2] is an important example: in that work, he uses these two algorithms to produce a new hybrid algorithm, and he shows that SVM problems can be transformed into the problem of computing the minimum distance between two convex hulls.
Definition 1.1 Let {x_1,..., x_m} be a set of vectors in R^n. Then a convex combination of {x_1,..., x_m} is given by
(1) ∑_i α_i x_i, where ∑_i α_i = 1 and α_i ≥ 0.

Definition 1.2 For a given set X = {x_1,..., x_m}, coX denotes the convex hull of X: coX is the set of all convex combinations of elements of X,
(2) coX = { ∑_i α_i x_i | ∑_i α_i = 1, α_i ≥ 0 }.
Gilbert's algorithm and the MDM algorithm are easy to understand and implement. However, these two algorithms are designed for finding the nearest point to the origin of a single given convex hull. Let X = {x_1,..., x_m} and Y = {y_1,..., y_s} be finite point sets in R^n, and let U and V denote the convex hulls generated by X and Y, respectively:
U = {u = ∑_i α_i x_i | ∑_i α_i = 1, α_i ≥ 0}
V = {v = ∑_j β_j y_j | ∑_j β_j = 1, β_j ≥ 0}.
Then the Gilbert and MDM algorithms find the solution of the following problem:
min ||u|| subject to u = ∑_{i=1}^m α_i x_i, ∑_i α_i = 1, α_i ≥ 0.
This problem is called the minimal norm problem (MNP); its solution is the nearest point to the origin of U. For x, y ∈ R^n, the Euclidean norm is defined by ||x|| = √(x, x), where the inner product is defined by (x, y) = x^t y = x_1 y_1 + ... + x_n y_n. The minimum distance between two convex hulls can be found by solving an optimization problem called the nearest point problem (NPP); an NPP is illustrated in Figure 1.1. By using Gilbert's algorithm or the MDM algorithm, one can find the minimum distance between two convex hulls by finding the nearest point to the origin of the convex hull generated by the differences of vectors.
Nearest Point Problem (NPP): min ||u - v||

subject to u = ∑_{i=1}^m α_i x_i, v = ∑_{j=1}^s β_j y_j, ∑_{i=1}^m α_i = 1, α_i ≥ 0, ∑_{j=1}^s β_j = 1, β_j ≥ 0.
Figure 1.1. Nearest Point Problem
Set Z = {x_i - y_j | x_i ∈ X and y_j ∈ Y} and let W = coZ:
W = {w = ∑_l γ_l z_l | ∑_l γ_l = 1, γ_l ≥ 0}.
Then the minimum distance between two convex hulls can be found by solving the following MNP:
min ||w|| subject to w = ∑_l γ_l z_l, ∑_l γ_l = 1, γ_l ≥ 0.
However, computing the minimum distance between two convex hulls with such algorithms requires large memory storage, because the convex hull W has m·s vertices. For example, let X = {x_1, x_2} and Y = {y_1, y_2, y_3}; then Z = X - Y = {x_1 - y_1, x_1 - y_2, x_1 - y_3, x_2 - y_1, x_2 - y_2, x_2 - y_3}. This means that if one set has 1000 points and the other set has 1000 points, then 1,000,000 points will be used in the Gilbert or MDM method; if these methods are used to find the minimum distance between two convex hulls, the calculation becomes too expensive. Thus, a direct method that does not form the differences of vectors is desired; the small sketch below makes the blow-up concrete.
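The following short Python sketch (an illustration only; the variable names are not taken from this dissertation) forms the difference set Z for two small point sets and reports its size, which grows like m·s.

import numpy as np

# Two small point sets X (m = 2 points) and Y (s = 3 points) in R^2.
X = np.array([[0.0, 0.0], [1.0, 0.0]])
Y = np.array([[3.0, 1.0], [4.0, 0.0], [3.0, 2.0]])

# Difference set Z = {x_i - y_j}: one vector for every (i, j) pair.
Z = np.array([x - y for x in X for y in Y])

print(Z.shape)   # (6, 2): m * s = 2 * 3 difference vectors
# With 1000 points in each set, Z would already hold 1,000,000 vectors.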

This dissertation presents a direct method of computing the minimum distance between two convex hulls. Chapter 2 proposes a new algorithm for finding the solution of the NPP and proves its convergence. Chapter 3 introduces support vector machines (SVM). Chapter 4 discusses solving SVM by using the new algorithm. Chapter 5 shows the experimental results comparing SMO and the new algorithm. Finally, the conclusion and future work for the new algorithm are discussed in Chapter 6.
1.2. Previous Works
In this section, two important iterative algorithms for finding the nearest point to the origin of a given convex hull are discussed, beginning with Gilbert's algorithm [5]. Gilbert's algorithm was the first algorithm developed for finding the minimum distance to the origin from a convex hull. It is a geometric method and can be explained by using the convex hull representation. First, let X = {x_1,..., x_m} and U = {u = ∑_i α_i x_i | ∑_i α_i = 1, α_i ≥ 0}.
Definition 1.3 For u ∈ U, define
(3) δ_MDM(u) = (u, u) - min_i (x_i, u).
Theorem 1 in [1] states that δ_MDM(u) = 0 if and only if u is the nearest point to the origin.
Gilbert's Algorithm

Step 1 Choose u_k ∈ U.
Step 2 Find x_{i_k} such that (x_{i_k}, u_k) = min_i (x_i, u_k).
Step 3 If δ_MDM(u_k) = 0, then stop. Otherwise, let u_{k+1} be the point of the line segment joining u_k and x_{i_k} that has the minimum norm.
Step 4 Go to Step 2.
In Step 3, the minimum-norm point of the line segment joining u_k and x_{i_k} is found in the following way:
u_k if (-u_k, x_{i_k} - u_k) ≤ 0;
x_{i_k} if ||x_{i_k} - u_k||^2 ≤ (-u_k, x_{i_k} - u_k);
u_k + ((-u_k, x_{i_k} - u_k)/||x_{i_k} - u_k||^2)(x_{i_k} - u_k) otherwise.
Gilbert's algorithm tends to converge fast at the beginning, but it is very slow as it approaches the solution because it keeps zigzagging until it finds the solution. Exploiting the fast convergence at the beginning, Keerthi [2] used it when he constructed a hybrid algorithm. A sketch of the iteration follows.
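The following Python sketch is one illustrative reading of Gilbert's iteration for the minimal norm problem; the helper names, the tolerance and the random test data are assumptions of the sketch, not part of the dissertation.

import numpy as np

def min_norm_on_segment(a, b):
    # Point of minimum norm on the segment joining a and b.
    d = b - a
    denom = d @ d
    if denom == 0.0:
        return a
    # Minimize ||a + t d||^2 over t in [0, 1].
    t = np.clip(-(a @ d) / denom, 0.0, 1.0)
    return a + t * d

def gilbert(X, tol=1e-10, max_iter=10000):
    # Nearest point to the origin of co{X} (rows of X), following Gilbert's steps.
    u = X[0].copy()
    for _ in range(max_iter):
        i = int(np.argmin(X @ u))            # Step 2: x_i with the smallest (x_i, u)
        delta = u @ u - X[i] @ u             # delta_MDM(u)
        if delta <= tol:                     # Step 3: optimality test
            break
        u = min_norm_on_segment(u, X[i])     # move to the min-norm point of the segment
    return u

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) + 2.0           # a hull that does not contain the origin
print(np.linalg.norm(gilbert(X)))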

After Gilbert's algorithm, Mitchell, Dem'yanov and Malozemov (MDM) suggested another geometric algorithm to find the nearest point to the origin. Like Gilbert's algorithm, the MDM algorithm is an iterative method for finding the nearest point to the origin. Let X and U be as above, and let u* be the nearest point to the origin. Define (x_{i'}, u) = max_{α_i > 0} (x_i, u), where α_i is the coefficient of x_i in the convex hull representation, and (x_{i''}, u) = min_i (x_i, u).
Definition 1.4 Define
(4) Δ_MDM(u) = (x_{i'}, u) - (x_{i''}, u).
Note that
(5) (x_{i'}, u) - (x_{i''}, u) = (x_{i'} - x_{i''}, u).
Lemma 2 in [1] states that α_{i'} Δ_MDM(u) ≤ δ_MDM(u) ≤ Δ_MDM(u), where α_{i'} is the coefficient of x_{i'} in the convex hull representation. Then, by Theorem 1 and Lemma 2 in [1], u* is the nearest point to the origin if and only if Δ_MDM(u*) = 0. The algorithm for finding u* is the following.
The MDM Algorithm
Step 1 Choose u_k ∈ U.
Step 2 Find x_{i_k} and x_{i'_k}, where (x_{i_k}, u_k) = min_i (x_i, u_k) and (x_{i'_k}, u_k) = max_{α_{i,k} > 0} (x_i, u_k).
Step 3 Set u_k(t) = u_k + t α_{i'_k}(x_{i_k} - x_{i'_k}), where α_{i'_k} is the coefficient of x_{i'_k} and t ∈ [0, 1].
Step 4 Set u_{k+1} = u_k + t_k α_{i'_k}(x_{i_k} - x_{i'_k}), where (u_k(t_k), u_k(t_k)) = min_{t ∈ [0,1]} (u_k(t), u_k(t)).
Step 5 If Δ_MDM,k(u_k) = (x_{i'_k} - x_{i_k}, u_k) = 0, then stop. Otherwise go to Step 2.
The convergence was proved in [1], and the algorithm is very simple to implement. In this algorithm u_k needs to stay in the convex hull: the algorithm adds t α_{i'_k} x_{i_k} and subtracts t α_{i'_k} x_{i'_k} from the current point so that the sum of the α_i remains 1. It may take some time to comprehend why only nonzero α_i are used when finding x_{i'} in the convex hull representation. The MDM algorithm has a reason to use x_{i'} when the current point u_k moves: if max_i (x_i, u) were used instead of max_{α_i > 0} (x_i, u), u_k might converge slowly or might not converge to u* at all. The factor t_k α_{i'_k} decides how far the current point moves, and x_{i_k} - x_{i'_k} gives the direction for u_k. Note that in each iteration u_k updates two coefficients of the representation; by modifying the method, four or more coefficients can be updated at a time, as shown in Appendix A. A sketch of the two-coefficient iteration follows.
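A corresponding sketch of the MDM iteration keeps the convex-combination coefficients explicitly so that the two updated coefficients are visible; again this is an illustrative reading with my own function and variable names.

import numpy as np

def mdm(X, tol=1e-10, max_iter=100000):
    # Nearest point to the origin of co{X}: MDM iteration on the coefficients alpha.
    m = len(X)
    alpha = np.full(m, 1.0 / m)                   # start at the barycenter
    for _ in range(max_iter):
        u = alpha @ X
        scores = X @ u                            # (x_i, u) for every i
        i_min = int(np.argmin(scores))            # x_i'' : smallest inner product
        pos = np.where(alpha > 0)[0]
        i_max = int(pos[np.argmax(scores[pos])])  # x_i'  : largest among alpha_i > 0
        if scores[i_max] - scores[i_min] <= tol:  # Delta_MDM(u) = 0 at the solution
            break
        d = X[i_min] - X[i_max]
        # Step 4: minimize ||u + t * alpha_max * d||^2 over t in [0, 1].
        t = min(1.0, (scores[i_max] - scores[i_min]) / (alpha[i_max] * (d @ d)))
        step = t * alpha[i_max]
        alpha[i_max] -= step                      # only two coefficients change
        alpha[i_min] += step
    return alpha @ X, alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) + 2.0
u_star, _ = mdm(X)
print(np.linalg.norm(u_star))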

CHAPTER 2
A NEW ALGORITHM FOR FINDING THE MINIMUM DISTANCE BETWEEN TWO CONVEX HULLS
2.1. A New Algorithm for Finding the Minimum Distance of Two Convex Hulls
This chapter introduces a new algorithm for finding the minimum distance of two convex hulls and a proof of its convergence. The algorithm is an iterative method for computing the minimum distance between two convex hulls; moreover, it has a good application to SVM. Let X, Y, Z, U, V and W be the same as in Chapter 1. As mentioned in Chapter 1, finding the nearest point to the origin of W is equivalent to solving the NPP. Let w* be the nearest point to the origin and u* - v* be the solution of the NPP; then w* = u* - v*. Note that for any w ∈ W, w = u - v = ∑_i α_i x_i - ∑_j β_j y_j,
δ_MDM(w) = (w, w) - min_l (z_l, w) = (w, w) - min_{i,j} (x_i - y_j, w),
Δ_MDM(w) = max_{γ_l > 0} (z_l, w) - min_l (z_l, w) = max_{α_i > 0, β_j > 0} (x_i - y_j, w) - min_{i,j} (x_i - y_j, w).
Mitchell, Dem'yanov and Malozemov defined Δ_MDM(w) and δ_MDM(w) in their paper [1].
Definition 2.1 For u ∈ U and v ∈ V, define
(6) Δ_x(u - v) = max_{α_i > 0} (x_i, u - v) - min_i (x_i, u - v),
(7) Δ_y(v - u) = max_{β_j > 0} (y_j, v - u) - min_j (y_j, v - u).
In fact, Δ_MDM(u - v) = Δ_x(u - v) + Δ_y(v - u); in order to show this, Lemma 2.1 is needed.

Lemma 2.1
(8) max_{i,j} (x_i - y_j, u - v) = max_i (x_i, u - v) - min_j (y_j, u - v)
(9) min_{i,j} (x_i - y_j, u - v) = min_i (x_i, u - v) - max_j (y_j, u - v)
(10) max_j (y_j, u - v) = -min_j (y_j, v - u)
(Proof) (8) is shown by two inequalities.
(≤) Let (x_M - y_M, u - v) = max_{i,j} (x_i - y_j, u - v); then (x_M - y_M, u - v) = (x_M, u - v) - (y_M, u - v). Since (x_M, u - v) ≤ max_i (x_i, u - v) and (y_M, u - v) ≥ min_j (y_j, u - v), it follows that max_{i,j} (x_i - y_j, u - v) = (x_M, u - v) - (y_M, u - v) ≤ max_i (x_i, u - v) - min_j (y_j, u - v).
(≥) Let (x_m, u - v) = max_i (x_i, u - v) and (y_m, u - v) = min_j (y_j, u - v); then (x_m, u - v) - (y_m, u - v) = (x_m - y_m, u - v) and max_{i,j} (x_i - y_j, u - v) ≥ (x_m - y_m, u - v), so max_{i,j} (x_i - y_j, u - v) ≥ max_i (x_i, u - v) - min_j (y_j, u - v).
The proof of (9) is similar to that of (8). (10) is also shown by two inequalities. Let (y_M, u - v) = max_j (y_j, u - v); then (y_M, u - v) = -(y_M, v - u), so max_j (y_j, u - v) = -(y_M, v - u) ≤ -min_j (y_j, v - u). Let (y_min, v - u) = min_j (y_j, v - u); then -(y_min, v - u) = (y_min, u - v), so -min_j (y_j, v - u) = (y_min, u - v) ≤ max_j (y_j, u - v).
Lemma 2.2
(11) Δ_MDM(u - v) = Δ_x(u - v) + Δ_y(v - u)
(Proof) Since u - v ∈ W,
Δ_MDM(u - v) = max_{α_i > 0, β_j > 0} (x_i - y_j, u - v) - min_{i,j} (x_i - y_j, u - v)
= max_{α_i > 0} (x_i, u - v) - min_{β_j > 0} (y_j, u - v) - min_i (x_i, u - v) + max_j (y_j, u - v)
= max_{α_i > 0} (x_i, u - v) - min_i (x_i, u - v) - min_{β_j > 0} (y_j, u - v) + max_j (y_j, u - v)

= max_{α_i > 0} (x_i, u - v) - min_i (x_i, u - v) + max_{β_j > 0} (y_j, v - u) - min_j (y_j, v - u)
= Δ_x(u - v) + Δ_y(v - u).
Using (8), (9) and (10), Lemma 2.2 is proved.
The new algorithm for finding the minimum distance between two convex hulls is an iterative method. In the new algorithm the following notations are used. Let (x_{i'_k}, u_k - v_k) = max_{α_{i,k} > 0} (x_i, u_k - v_k) and (x_{i_k}, u_k - v_k) = min_i (x_i, u_k - v_k); then
(12) Δ_x(u_k - v_k) = (x_{i'_k} - x_{i_k}, u_k - v_k),
(13) Δ_y(v_k - u_k) = (y_{j'_k} - y_{j_k}, v_k - u_k).
Algorithm 2.1
Step 1 Choose u_k ∈ U and v_k ∈ V, where u_k = ∑_i α_{i,k} x_i and v_k = ∑_j β_{j,k} y_j.
Step 2 Find x_{i_k} and x_{i'_k}, where (x_{i_k}, u_k - v_k) = min_i (x_i, u_k - v_k) and (x_{i'_k}, u_k - v_k) = max_{α_{i,k} > 0} (x_i, u_k - v_k).
Step 3 Set u_{k+1} = u_k + t_k α_{i'_k}(x_{i_k} - x_{i'_k}), where 0 ≤ t_k ≤ 1 and t_k = min{1, Δ_x(u_k - v_k)/(α_{i'_k} ||x_{i_k} - x_{i'_k}||^2)}. Note that t_k is defined by (u_k(t_k) - v_k, u_k(t_k) - v_k) = min_{t ∈ [0,1]} (u_k(t) - v_k, u_k(t) - v_k), where u_k(t) = u_k + t α_{i'_k}(x_{i_k} - x_{i'_k}) and (u_k(t) - v_k, u_k(t) - v_k) = (α_{i'_k} ||x_{i_k} - x_{i'_k}||)^2 t^2 - 2 α_{i'_k} Δ_x(u_k - v_k) t + (u_k - v_k, u_k - v_k).
Step 4 Find y_{j_k} and y_{j'_k}, where (y_{j_k}, v_k - u_{k+1}) = min_j (y_j, v_k - u_{k+1}) and (y_{j'_k}, v_k - u_{k+1}) = max_{β_{j,k} > 0} (y_j, v_k - u_{k+1}).
Step 5 Set v_{k+1} = v_k + t'_k β_{j'_k}(y_{j_k} - y_{j'_k}), where 0 ≤ t'_k ≤ 1 and t'_k = min{1, Δ_y(v_k - u_{k+1})/(β_{j'_k} ||y_{j_k} - y_{j'_k}||^2)}. Note that t'_k is defined by

(v_k(t'_k) - u_{k+1}, v_k(t'_k) - u_{k+1}) = min_{t' ∈ [0,1]} (v_k(t') - u_{k+1}, v_k(t') - u_{k+1}), where v_k(t') = v_k + t' β_{j'_k}(y_{j_k} - y_{j'_k}) and (v_k(t') - u_{k+1}, v_k(t') - u_{k+1}) = (β_{j'_k} ||y_{j_k} - y_{j'_k}||)^2 t'^2 - 2 t' β_{j'_k} Δ_y(v_k - u_{k+1}) + (v_k - u_{k+1}, v_k - u_{k+1}).
Step 6 Iterating this way, a sequence {u_k - v_k} is obtained such that ||u_{k+1} - v_{k+1}|| ≤ ||u_k - v_k||, since ||u_{k+1} - v_k|| ≤ ||u_k - v_k|| and ||u_{k+1} - v_{k+1}|| ≤ ||u_{k+1} - v_k||.
Figure 2.1. Algorithm 2.1
Figure 2.1 describes how to find the new u and the new v: first find the new u by using the old u and the old v, then find the new v by using the old v and the new u. Iterated this way, the algorithm finds the solution u* - v* of the NPP. A sketch of the iteration is given below.
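The following Python sketch is a minimal reading of Algorithm 2.1 as stated in Steps 1-6; the function names, the stopping tolerance and the random test data are assumptions of the sketch, and no SVM-specific handling is included.

import numpy as np

def half_step(P, coef, w):
    # One update of one hull's coefficients against the current difference w (Steps 2-3 / 4-5).
    scores = P @ w
    i_min = int(np.argmin(scores))
    pos = np.where(coef > 0)[0]
    i_max = int(pos[np.argmax(scores[pos])])
    delta = scores[i_max] - scores[i_min]        # Delta_x(u - v) or Delta_y(v - u)
    if delta <= 0.0:
        return 0.0
    d = P[i_min] - P[i_max]
    t = min(1.0, delta / (coef[i_max] * (d @ d)))
    step = t * coef[i_max]
    coef[i_max] -= step
    coef[i_min] += step
    return delta

def algorithm_2_1(X, Y, tol=1e-10, max_iter=100000):
    # Nearest point problem between co{X} and co{Y} (rows are points).
    alpha = np.full(len(X), 1.0 / len(X))
    beta = np.full(len(Y), 1.0 / len(Y))
    for _ in range(max_iter):
        u, v = alpha @ X, beta @ Y
        dx = half_step(X, alpha, u - v)          # update u with the old v
        u = alpha @ X
        dy = half_step(Y, beta, v - u)           # update v with the new u
        if dx + dy <= tol:                       # Theorem 2: optimal iff both gaps vanish
            break
    return alpha @ X, beta @ Y

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) + 3.0
Y = rng.normal(size=(100, 5)) - 3.0
u, v = algorithm_2_1(X, Y)
print(np.linalg.norm(u - v))                     # approximate minimum distance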

2.2. Convergence for Algorithm 2.1
Before proving the convergence of Algorithm 2.1, define the following.
Definition 2.2 For u ∈ U and v ∈ V, define
(14) δ_x(u - v) = (u, u - v) - min_i (x_i, u - v),
(15) δ_y(v - u) = (v, v - u) - min_j (y_j, v - u).
Lemma 2.3 ||u - v|| - ||u* - v*|| ≤ δ_x(u - v) + δ_y(v - u).
(Proof) Lemma 2 in [1] states that for any w ∈ W, ||w|| - ||w*|| ≤ δ_MDM(w). Since u - v ∈ W,
δ_MDM(u - v) = (u - v, u - v) - min_{i,j} (x_i - y_j, u - v)
= (u, u - v) - (v, u - v) - min_i (x_i, u - v) - min_j (y_j, v - u)
= (u, u - v) - min_i (x_i, u - v) + (v, v - u) - min_j (y_j, v - u)
= δ_x(u - v) + δ_y(v - u).
Then ||u - v|| - ||u* - v*|| ≤ δ_MDM(u - v) = δ_x(u - v) + δ_y(v - u).
Lemma 2.4 Suppose u_k ∈ U, v_k ∈ V, k = 1, 2,..., is a sequence of points such that ||u_{k+1} - v_{k+1}|| ≤ ||u_k - v_k||, and there is a subsequence such that δ_x(u_{k_j} - v_{k_j}) → 0 and δ_y(v_{k_j} - u_{k_j}) → 0. Then u_k - v_k → u* - v* = w*.
(Proof) Since ||u_k - v_k|| is a nonincreasing sequence bounded below, it has a limit ||u_k - v_k|| → µ. By Lemma 2.3, ||u_{k_j} - v_{k_j}|| - ||u* - v*|| → 0, since δ_x(u_{k_j} - v_{k_j}) → 0 and δ_y(v_{k_j} - u_{k_j}) → 0. This implies u_{k_j} - v_{k_j} → u* - v* = w*, and hence u_k - v_k → u* - v* = w*: any convergent subsequence of u_k - v_k must converge to w* = u* - v* because of the uniqueness of w* = u* - v*.

Lemmas 2.3 and 2.4 are important for proving the convergence of Algorithm 2.1. After the following theorem, support vectors will be defined; support vectors are a significant concept in this dissertation and in support vector machines (SVM).
Theorem 1
(16) min_i (x_i, u* - v*) = (u*, u* - v*)
(17) min_j (y_j, v* - u*) = (v*, v* - u*)
(Proof) Let u* = ∑_i α_i* x_i and v* = ∑_j β_j* y_j. Note that min_i (x_i - v*, u* - v*) ≥ (u* - v*, u* - v*) and min_j (y_j - u*, v* - u*) ≥ (v* - u*, v* - u*). Then min_i (x_i, u* - v*) - (v*, u* - v*) ≥ (u*, u* - v*) - (v*, u* - v*), so min_i (x_i, u* - v*) ≥ (u*, u* - v*). But (u*, u* - v*) = (∑_i α_i* x_i, u* - v*) = ∑_i α_i* (x_i, u* - v*) ≥ min_i (x_i, u* - v*). Thus min_i (x_i, u* - v*) = (u*, u* - v*). Also min_j (y_j, v* - u*) - (u*, v* - u*) ≥ (v*, v* - u*) - (u*, v* - u*), so min_j (y_j, v* - u*) ≥ (v*, v* - u*); but (v*, v* - u*) = (∑_j β_j* y_j, v* - u*) = ∑_j β_j* (y_j, v* - u*) ≥ min_j (y_j, v* - u*). Thus min_j (y_j, v* - u*) = (v*, v* - u*).
Definition 2.3 Let the solution of the NPP be u* - v*. Then the x_i corresponding to α_i* > 0 and the y_j corresponding to β_j* > 0 are called support vectors.
As seen in Figure 2.2, u* and v* can be represented by the support vectors x_1, x_2, y_1 and y_6. Subtracting (v*, u* - v*) from both sides of (16) gives min_i (x_i - v*, u* - v*) = (u* - v*, u* - v*); similarly, from (17), min_j (y_j - u*, v* - u*) = (v* - u*, v* - u*).

Figure 2.2. Support Vectors
Lemma 2.5 For any vector u - v with u ∈ U, v ∈ V,
(18) α_{i'} Δ_x(u - v) ≤ δ_x(u - v) ≤ Δ_x(u - v), where α_{i'} > 0 and (x_{i'}, u - v) = max_{α_i > 0} (x_i, u - v);
(19) β_{j'} Δ_y(v - u) ≤ δ_y(v - u) ≤ Δ_y(v - u), where β_{j'} > 0 and (y_{j'}, v - u) = max_{β_j > 0} (y_j, v - u).
(Proof of (18)) Let u = ∑_{i=1}^m α_i x_i. Then
(u - v, u - v) = ∑_{i=1}^m α_i (x_i - v, u - v) ≤ max_{α_i > 0} (x_i - v, u - v),
which gives (u, u - v) ≤ max_{α_i > 0} (x_i, u - v), and hence δ_x(u - v) ≤ Δ_x(u - v).

Let (x_{i''}, u - v) = min_i (x_i, u - v) and (x_{i'}, u - v) = max_{α_i > 0} (x_i, u - v); then Δ_x(u - v) = (x_{i'} - x_{i''}, u - v). Set A' = (α'_1,..., α'_m), where α'_i = α_i if i ≠ i', i'', α'_{i'} = 0 and α'_{i''} = α_{i''} + α_{i'}, and let u' = u + α_{i'}(x_{i''} - x_{i'}). Then
(u' - v, u - v) = (u + α_{i'}(x_{i''} - x_{i'}) - v, u - v)
= (u - v, u - v) + α_{i'}(x_{i''} - x_{i'}, u - v)
= (u - v, u - v) - α_{i'} Δ_x(u - v),
so
δ_x(u - v) = (u - v, u - v) - min_i (x_i - v, u - v) ≥ (u - v, u - v) - (u' - v, u - v).
But the right-hand side of this inequality equals (u - v, u - v) - (u - v, u - v) + α_{i'} Δ_x(u - v), so δ_x(u - v) ≥ α_{i'} Δ_x(u - v). The proof of (19) is similar to the proof of (18).
Lemma 2.6
(20) lim_{k→∞} α_{i'_k} Δ_x(u_k - v_k) = 0
(21) lim_{k→∞} β_{j'_k} Δ_y(v_k - u_k) = 0

(Proof of (20)) First, observe that u_k(t) = u_k + t α_{i'_k}(x_{i_k} - x_{i'_k}), v_k(t') = v_k + t' β_{j'_k}(y_{j_k} - y_{j'_k}) and ||u_{k+1} - v_{k+1}|| ≤ ||u_k - v_k||. Note that
(u_k(t) - v_k, u_k(t) - v_k) = (u_k - v_k, u_k - v_k) - 2 t α_{i'_k} Δ_x(u_k - v_k) + t^2 (α_{i'_k} ||x_{i_k} - x_{i'_k}||)^2.
Suppose the assertion is false. Then there is a subsequence u_{k_j} - v_{k_j} for which α_{i'_{k_j}} Δ_x(u_{k_j} - v_{k_j}) ≥ ε > 0, and
(u_{k_j}(t) - v_{k_j}, u_{k_j}(t) - v_{k_j}) ≤ (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - 2tε + t^2 d^2, where d = max_{l,p} ||x_l - x_p||.
The right-hand side of this inequality is minimized at t = min{ε/d^2, 1}.
(2.6.1) If t = ε/d^2, then (u_{k_j}(t) - v_{k_j}, u_{k_j}(t) - v_{k_j}) ≤ (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - 2(ε/d^2)ε + (ε/d^2)^2 d^2 = (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - ε^2/d^2 = (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - tε.
(2.6.2) If t = 1 (so that ε/d^2 > 1, i.e., ε > t d^2), then (u_{k_j}(t) - v_{k_j}, u_{k_j}(t) - v_{k_j}) ≤ (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - 2tε + t^2 d^2 ≤ (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - 2tε + εt = (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - tε.
By (2.6.1) and (2.6.2), (u_{k_j}(t) - v_{k_j}, u_{k_j}(t) - v_{k_j}) ≤ (u_{k_j} - v_{k_j}, u_{k_j} - v_{k_j}) - tε, so the squared distance would decrease by the fixed positive amount tε infinitely often; since (u_k - v_k, u_k - v_k) is nonincreasing and bounded below, this is a contradiction. The proof of (21) is similar to that of (20).
Lemma 2.7
(22) lim_{k→∞} Δ_x(u_k - v_k) = 0
(23) lim_{k→∞} Δ_y(v_k - u_k) = 0
(Proof) Suppose lim Δ_x(u_k - v_k) ≠ 0 and lim Δ_y(v_k - u_k) ≠ 0. Then there is a limit value Δ_x^0(u^0 - v^0) > 0 such that lim Δ_x(u_k - v_k) = Δ_x^0(u^0 - v^0).

Then for sufficiently large k > K_1,
(24) Δ_x(u_k - v_k) ≥ Δ_x^0(u^0 - v^0)/2.
And there is Δ_y^0(v^0 - u^0) such that lim Δ_y(v_k - u_k) = Δ_y^0(v^0 - u^0); then for sufficiently large k > K_2,
(25) Δ_y(v_k - u_k) ≥ Δ_y^0(v^0 - u^0)/2.
By Lemma 2.6, α_{i'_k} → 0. Let t_k be the point at which (u_k(t) - v_k, u_k(t) - v_k) assumes its global minimum. Then
(u_k(t) - v_k, u_k(t) - v_k) = (u_k + t α_{i'_k}(x_{i_k} - x_{i'_k}) - v_k, u_k + t α_{i'_k}(x_{i_k} - x_{i'_k}) - v_k)
= (u_k - v_k + t α_{i'_k}(x_{i_k} - x_{i'_k}), u_k - v_k + t α_{i'_k}(x_{i_k} - x_{i'_k}))
= (u_k - v_k, u_k - v_k) - 2 t α_{i'_k} Δ_x(u_k - v_k) + t^2 (α_{i'_k} ||x_{i_k} - x_{i'_k}||)^2,
so
(26) t_k = Δ_x(u_k - v_k)/(α_{i'_k} ||x_{i_k} - x_{i'_k}||^2).
By Δ_x(u_k - v_k) ≥ Δ_x^0(u^0 - v^0)/2 and α_{i'_k} → 0, t_k → ∞. Again by Lemma 2.6, β_{j'_k} → 0; let t'_k be the point at which (v_k(t') - u_k, v_k(t') - u_k) assumes its global minimum. Then
(27) t'_k = Δ_y(v_k - u_k)/(β_{j'_k} ||y_{j_k} - y_{j'_k}||^2).
Again by Δ_y(v_k - u_k) ≥ Δ_y^0(v^0 - u^0)/2 and β_{j'_k} → 0, t'_k → ∞. Hence, for large k > K > K_1, the minimum of (u_k(t) - v_k, u_k(t) - v_k) on the segment 0 ≤ t ≤ 1 is attained at t_k = 1, so that for such values of k
(28) u_{k+1} = u_k + α_{i'_k}(x_{i_k} - x_{i'_k}).

Also, for large k > K' > K_2, the minimum of (v_k(t') - u_k, v_k(t') - u_k) on the segment 0 ≤ t' ≤ 1 is attained at t'_k = 1, so that for such values of k
(29) v_{k+1} = v_k + β_{j'_k}(y_{j_k} - y_{j'_k}).
Let u_{k_j} - v_{k_j} be a subsequence such that
(30) Δ_x(u_{k_j} - v_{k_j}) → Δ_x^0(u^0 - v^0).
By discarding terms from the sequence u_{k_j} - v_{k_j}, the following conditions can be satisfied:
(31) α_{i'_{k_j}} → α_{i'}^0 and β_{j'_{k_j}} → β_{j'}^0, where α_{i'}^0 and β_{j'}^0 are coefficients of u^0 and v^0 respectively;
(32) x_{i_{k_j}} = x_{i_0} ∈ X, which means that min_i (x_i, u_{k_j} - v_{k_j}) is attained for all k_j at the same vector: (x_{i_0}, u_{k_j} - v_{k_j}) = min_i (x_i, u_{k_j} - v_{k_j});
(33) x_{i'_{k_j}} = x_{i'_0} ∈ X, with
(34) (x_{i'_0}, u_{k_j} - v_{k_j}) = max_{α_i > 0} (x_i, u_{k_j} - v_{k_j}).
Using (30), (31), (32) and (33), the following is obtained:
(x_{i'_0} - x_{i_0}, u^0 - v^0) = Δ_x^0(u^0 - v^0).
Now introduce the new notations
(35) ρ_x = min_i (x_i, u^0 - v^0),
(36) X_1 = {x_i ∈ X | (x_i, u^0 - v^0) = ρ_x},
(37) X_2 = X \ X_1.
Note that
(38) x_i ∈ X_2 implies (x_i, u^0 - v^0) > ρ_x,

(39) x_{i_0} ∈ X_1 and x_{i'_0} ∈ X_2.
Set
(40) ρ'_x = min_{x_i ∈ X_2} (x_i, u^0 - v^0),
so that
(41) ρ'_x > ρ_x,
and then set
(42) τ_x = min{Δ_x^0(u^0 - v^0), ρ'_x - ρ_x}.
Note that for x_i ∈ X_2, (x_i, u^0 - v^0) ≥ ρ_x + τ_x. Choose δ_{0x} such that whenever ||(u - v) - (u^0 - v^0)|| < δ_{0x},
(43) max_i |(x_i, u - v) - (x_i, u^0 - v^0)| < τ_x/4.
(Claim) Let k_j be such that ||u_{k_j} - v_{k_j} - (u^0 - v^0)|| < δ_{0x}. Then X_1(u_{k_j} - v_{k_j}) ⊂ X_1, where X_1(u_{k_j} - v_{k_j}) = {x_i ∈ X | (x_i, u_{k_j} - v_{k_j}) = min_i (x_i, u_{k_j} - v_{k_j})}.
(Proof of claim) Let x_i ∈ X_1(u_{k_j} - v_{k_j}). Then
(x_i, u^0 - v^0) = (x_i, u_{k_j} - v_{k_j}) + (x_i, (u^0 - v^0) - (u_{k_j} - v_{k_j}))
= min_i (x_i, u_{k_j} - v_{k_j}) + (x_i, (u^0 - v^0) - (u_{k_j} - v_{k_j}))
= ρ_x + (x_i, (u^0 - v^0) - (u_{k_j} - v_{k_j})) + [min_i (x_i, u_{k_j} - v_{k_j}) - min_i (x_i, u^0 - v^0)]
≤ ρ_x + (x_i, (u^0 - v^0) - (u_{k_j} - v_{k_j})) + max_i |(x_i, u_{k_j} - v_{k_j}) - (x_i, u^0 - v^0)|
≤ ρ_x + τ_x/4 + τ_x/4 = ρ_x + τ_x/2.

Since (x_i, u^0 - v^0) ≤ ρ_x + τ_x/2 < ρ_x + τ_x, x_i cannot belong to X_2, so x_i ∈ X_1. (End of claim)
Let k_j > K be such that the following conditions hold:
(44) Δ_x(u_{k_j} - v_{k_j}) ≥ Δ_x^0(u^0 - v^0) - τ_x/4;
(45) ||u_{k_j} - v_{k_j} - (u^0 - v^0)|| < δ_{0x}/2;
(46) ∑_{l=1}^p α_{i'_{k_j + l - 1}} < δ_{0x}/(4d), which is possible since α_{i'_k} → 0, where d_x = max_{l,r} ||x_l - x_r|| > 0, d_y = max_{l,r} ||y_l - y_r|| and d = max{d_x, d_y};
(47) ∑_{l=1}^p β_{j'_{k_j + l - 1}} < δ_{0x}/(4d), which is possible since β_{j'_k} → 0.
By (28) and (46),
(48) u_{k_j + p} = u_{k_j} + ∑_{l=1}^p α_{i'_{k_j + l - 1}}(x_{i_{k_j + l - 1}} - x_{i'_{k_j + l - 1}}),
(49) ||u_{k_j + p} - u_{k_j}|| < (δ_{0x}/(4d)) d = δ_{0x}/4.
By (29) and (47),
v_{k_j + p} = v_{k_j} + ∑_{l=1}^p β_{j'_{k_j + l - 1}}(y_{j_{k_j + l - 1}} - y_{j'_{k_j + l - 1}}),
||v_{k_j + p} - v_{k_j}|| < (δ_{0x}/(4d)) d = δ_{0x}/4.
Then by (45),
||u_{k_j + p} - v_{k_j + p} - (u^0 - v^0)|| ≤ ||u_{k_j + p} - v_{k_j + p} - (u_{k_j} - v_{k_j})|| + ||u_{k_j} - v_{k_j} - (u^0 - v^0)|| < δ_{0x}.

Hence X_1(u_{k_j + p} - v_{k_j + p}) ⊂ X_1. Next, for all x_i ∈ X_1 and p ∈ [1 : m], the following is obtained:
(x_i, u_{k_j + p} - v_{k_j + p}) = (x_i, u^0 - v^0) + (x_i, u_{k_j + p} - v_{k_j + p} - (u^0 - v^0))
≤ ρ_x + τ_x/4
≤ Δ_x^0(u^0 - v^0) - τ_x + ρ_x + τ_x/4   (using τ_x ≤ Δ_x^0(u^0 - v^0))
= Δ_x^0(u^0 - v^0) + ρ_x - 3τ_x/4,
so
(50) (x_i, u_{k_j + p} - v_{k_j + p}) ≤ Δ_x^0(u^0 - v^0) + ρ_x - 3τ_x/4.
Now it is shown that for some vector u_{k_j + p} - v_{k_j + p}, p ∈ [1 : m],
max_{α_{i, k_j + p} > 0} (x_i, u_{k_j + p} - v_{k_j + p}) ≤ Δ_x^0(u^0 - v^0) + ρ_x - 3τ_x/4.
If this inequality fails to hold for some p, then x_{i'_{k_j + p}} ∈ X_2. Since X_1(u_{k_j + p} - v_{k_j + p}) ⊂ X_1, u_{k_j + p + 1} = u_{k_j + p} + α_{i'_{k_j + p}}(x_{i_{k_j + p}} - x_{i'_{k_j + p}}) involves the vector x_{i'_{k_j + p}} with zero coefficient, while the newly introduced vector x_{i_{k_j + p}} satisfies (50). Since X_2 contains at most m - 1 vectors, there is p ∈ [1 : m] such that all vectors appearing in the representation satisfy (50). On the other hand,
max_{α_i > 0} (x_i, u_{k_j + p} - v_{k_j + p}) ≤ Δ_x^0(u^0 - v^0) + ρ_x - 3τ_x/4
and
(51) min_i (x_i, u_{k_j + p} - v_{k_j + p}) ≥ ρ_x - τ_x/4.
Thus Δ_x(u_{k_j + p} - v_{k_j + p}) = max_{α_i > 0} (x_i, u_{k_j + p} - v_{k_j + p}) - min_i (x_i, u_{k_j + p} - v_{k_j + p}) ≤ Δ_x^0(u^0 - v^0) - τ_x/2. This contradicts (44). Thus lim Δ_x(u_k - v_k) = 0; similarly, lim Δ_y(v_k - u_k) = 0 can be shown.
Theorem 2 Δ_x(u - v) + Δ_y(v - u) = 0 if and only if u - v = u* - v*.

(Proof) By Lemma 2.2, Δ_x(u - v) + Δ_y(v - u) = Δ_MDM(u - v), and by Theorem 1 and Lemma 2 in [1], Δ_MDM(u - v) = 0 if and only if u - v = u* - v*. Thus the theorem is proved.
Theorem 3 The sequence u_k - v_k converges to u* - v*.
(Proof) By Lemma 2.7, there are subsequences u_{k_j} - v_{k_j} and v_{k_j} - u_{k_j} such that Δ_x(u_{k_j} - v_{k_j}) → 0 and Δ_y(v_{k_j} - u_{k_j}) → 0. Then by Lemmas 2.3, 2.4 and 2.5, u_k - v_k converges to u* - v*.
Theorem 4 Let
(52) G = {u = ∑_i α_i x_i | (u, u* - v*) = (u*, u* - v*)},
(53) G' = {v = ∑_j β_j y_j | (v, v* - u*) = (v*, v* - u*)}.
Then for large k, u_k ∈ G and v_k ∈ G'.
(Proof) By Theorem 1, min_i (x_i, u* - v*) = (u*, u* - v*) and min_j (y_j, v* - u*) = (v*, v* - u*). Set
X_1 = {x_i ∈ X | (x_i, u* - v*) = (u*, u* - v*)}, X_2 = X \ X_1 = {x_i ∈ X | (x_i, u* - v*) > (u*, u* - v*)},
Y_1 = {y_j ∈ Y | (y_j, v* - u*) = (v*, v* - u*)}, Y_2 = Y \ Y_1 = {y_j ∈ Y | (y_j, v* - u*) > (v*, v* - u*)}.
If X_2 = ∅ then u_k ∈ G for all k, and if Y_2 = ∅ then v_k ∈ G' for all k. Now suppose X_2 ≠ ∅ and Y_2 ≠ ∅, and let
(54) τ = min_{x_i ∈ X_2} (x_i, u* - v*) - (u*, u* - v*) > 0

and
(55) τ' = min_{y_j ∈ Y_2} (y_j, v* - u*) - (v*, v* - u*) > 0.
Note that (x_i - v_k, u_k - v_k) → (x_i - v*, u* - v*) and (y_j - u_k, v_k - u_k) → (y_j - u*, v* - u*), since u_k - v_k → u* - v*. It follows that for large k > K_1,
(56) max_i |(x_i - v_k, u_k - v_k) - (x_i - v*, u* - v*)| < τ/4,
(57) max_j |(y_j - u_k, v_k - u_k) - (y_j - u*, v* - u*)| < τ'/4.
Then, by the claim in Lemma 2.7, for large k > K_1,
X_1(u_k - v_k) ⊂ X_1, where X_1(u_k - v_k) = {x_i ∈ X | (x_i, u_k - v_k) = min_i (x_i, u_k - v_k)}, and
Y_1(v_k - u_k) ⊂ Y_1, where Y_1(v_k - u_k) = {y_j ∈ Y | (y_j, v_k - u_k) = min_j (y_j, v_k - u_k)}.
Now let x_i ∈ X_1; then
(58) (x_i, u* - v*) = (u*, u* - v*).
By (56), max_i |(x_i, u_k - v_k) - (x_i, u* - v*)| < τ/4, so
(59) |(x_i, u_k - v_k) - (x_i, u* - v*)| < τ/4,
and by (58),
(60) (x_i, u_k - v_k) < (u*, u* - v*) + τ/4.
Now let x_i ∈ X_2; then by (54)
(61) (x_i, u* - v*) - (u*, u* - v*) ≥ τ,
and by (56)
(62) -τ/4 < (x_i, u_k - v_k) - (x_i, u* - v*) < τ/4.
Adding (x_i, u_k - v_k) - (x_i, u* - v*) to both sides of (61), then

(x_i, u_k - v_k) - (x_i, u* - v*) + (x_i, u* - v*) - (u*, u* - v*) ≥ τ + (x_i, u_k - v_k) - (x_i, u* - v*), and by (62),
(63) (x_i, u_k - v_k) ≥ (u*, u* - v*) + 3τ/4.
Similarly,
(64) (y_j, v_k - u_k) ≤ (v*, v* - u*) + τ'/4 if y_j ∈ Y_1, and
(65) (y_j, v_k - u_k) ≥ (v*, v* - u*) + 3τ'/4 if y_j ∈ Y_2
are obtained. For u_k with k > K_1, if the convex hull representation of u_k involves some x_i ∈ X_2 with a nonzero coefficient, then Δ_x(u_k - v_k) ≥ τ/2, by (60), (63) and X_1(u_k - v_k) ⊂ X_1. Similarly, if v_k with k > K'_1 involves some y_j ∈ Y_2 with a nonzero coefficient, then
(66) Δ_y(v_k - u_k) ≥ τ'/2.
For large k > K_1 and k > K'_1, u_k and v_k can be written as
(67) u_k = ∑_{i: x_i ∈ X_1} α_i^(k) x_i + ∑_{i: x_i ∈ X_2} α_i^(k) x_i,
(68) v_k = ∑_{j: y_j ∈ Y_1} β_j^(k) y_j + ∑_{j: y_j ∈ Y_2} β_j^(k) y_j.
Consider (u_k - u*, u* - v*):
(u_k - u*, u* - v*) = (u_k, u* - v*) - (u*, u* - v*)
= (∑_{i: x_i ∈ X_1} α_i^(k) x_i + ∑_{i: x_i ∈ X_2} α_i^(k) x_i, u* - v*) - (u*, u* - v*)
= ∑_{i: x_i ∈ X_1} α_i^(k) (x_i, u* - v*) + ∑_{i: x_i ∈ X_2} α_i^(k) (x_i, u* - v*) - ∑_{i: x_i ∈ X_1} α_i^(k) (u*, u* - v*) - ∑_{i: x_i ∈ X_2} α_i^(k) (u*, u* - v*)
= ∑_{i: x_i ∈ X_2} α_i^(k) (x_i - u*, u* - v*) ≥ τ ∑_{i: x_i ∈ X_2} α_i^(k),

since (x_i, u* - v*) = (u*, u* - v*) if x_i ∈ X_1 and (x_i - u*, u* - v*) ≥ τ if x_i ∈ X_2. By the convergence of u_k - v_k, the left-hand side of this inequality tends to zero as k → ∞; this means that ∑_{i: x_i ∈ X_2} α_i^(k) → 0.
(69) Similarly, (v_k - v*, v* - u*) = ∑_{j: y_j ∈ Y_2} β_j^(k) (y_j - v*, v* - u*) ≥ τ' ∑_{j: y_j ∈ Y_2} β_j^(k), so ∑_{j: y_j ∈ Y_2} β_j^(k) → 0 as well.
Choose K > K_1 so large that for k ≥ K,
∑_{i: x_i ∈ X_2} α_i^(k) ≤ τ/(2d^2), where d = max_{x_i ∈ X_1, x_j ∈ X_2} ||x_i - x_j|| > 0.
Let t_k be the point at which (u_k(t) - v_k, u_k(t) - v_k) assumes its global minimum. If the representation of u_k, k ≥ K, involves some x_i ∈ X_2 with a nonzero coefficient, then
t_k = Δ_x(u_k - v_k)/(α_{i'_k} ||x_{i_k} - x_{i'_k}||^2) ≥ (τ/2)/((∑_{i: x_i ∈ X_2} α_i^(k)) d^2) ≥ 1
by Δ_x(u_k - v_k) ≥ τ/2. Hence u_{k+1} = u_k + α_{i'_k}(x_{i_k} - x_{i'_k}). Similarly, choose K' > K'_1 so large that for k ≥ K', v_{k+1} = v_k + β_{j'_k}(y_{j_k} - y_{j'_k}).
Now, for u_{k+1}, x_{i_k} ∈ X_1, while x_{i'_k} disappears from the representation of u_{k+1} (i.e., x_{i'_k} has zero coefficient in u_{k+1}). Then there is l ∈ {1,..., m} such that u_{k+l} does not involve any vector x_i ∈ X_2, so there is k > K + l such that u_k ∈ G. Similarly, there is some k such that v_k ∈ G'.

CHAPTER 3
SUPPORT VECTOR MACHINES
3.1. Introduction to Support Vector Machines (SVM)
Support vector machines (SVM) are a family of learning algorithms used for the classification of objects into two classes. The theory was developed by Vapnik and Chervonenkis, and SVM became very popular in the early 1990s. SVM has been broadly applied to all kinds of classification tasks, from handwriting recognition to face detection in images. SVM has gained popularity because it can be applied to a wide range of real-world applications with computational efficiency; moreover, it is theoretically robust. For a linearly separable data set there are many possible decision boundaries. In order to generalize well to new data, a good decision boundary is needed: the decision boundary should be as far away from both classes as possible. The shortest distance between the two boundaries is called the margin. SVM maximizes the margin, so that the maximized margin results in fewer errors on future tests, while other methods only reduce classification errors; in this sense SVM generalizes better than other methods. To illustrate this, consider Figure 3.1. Lines (1) and (2) both classify the two groups with zero errors, but the goal of classification is to classify future inputs. If a test point T belongs to the O group, then (1) and (2) are no longer equivalent: (2) is a better generalization than (1). The maximal margin can be obtained by solving a quadratic programming (QP) optimization problem; the optimal hyperplane, which has the maximal margin, is obtained as its solution. The SVM QP optimization problem and the classifier will be introduced in Section 3.2. In 1992, Bernhard Boser, Isabelle Guyon and Vapnik added the kernel trick, which will be introduced

Figure 3.1. SVM Generalization
in Section 3.5, to the linear classifier. The kernel trick makes linear classification possible in a feature space that is usually larger than the original space.
3.2. SVM for Linearly Separable Sets
This section introduces the most basic SVM problem: SVM with linearly separable sets. Let S be a training set of points x_i ∈ R^m with labels t_i ∈ {-1, 1} for i = 1,..., N, so S = {(x_1, t_1),..., (x_N, t_N)}; i.e., if t_i = +1 then x_i belongs to the first class, and if t_i = -1 then x_i belongs to the second class. Let S be linearly separable; then there is a hyperplane wx + b = 0, with w ∈ R^m, that correctly classifies every x_i in S. Define the classifier f(x_i) = sign(wx_i + b), so that f(x_i) = 1 if t_i = 1 and f(x_i) = -1 if t_i = -1; i.e.,
wx_i + b > 0 if t_i = +1,
wx_i + b < 0 if t_i = -1.
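As a concrete reading of this classifier, the short sketch below labels points by the sign of wx + b; the weight vector and bias are hypothetical values chosen only for illustration.

import numpy as np

w = np.array([1.0, -2.0])        # hypothetical weight vector
b = 0.5                          # hypothetical bias

def classify(x):
    # Returns +1 for the first class and -1 for the second class.
    return 1 if w @ x + b > 0 else -1

print(classify(np.array([3.0, 0.0])))    # wx + b = 3.5 > 0  -> +1
print(classify(np.array([0.0, 1.0])))    # wx + b = -1.5 < 0 -> -1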

Figure 3.2. Linear Classifier
Even if there is a classifier that separates the training set correctly, a maximal margin is needed: in order to separate the two sets correctly and have better generalization, it is necessary to maximize the margin. To compute an optimal hyperplane, consider the following inequalities for all i = 1,..., N:
wx_i + b ≥ 1 if t_i = +1,
wx_i + b ≤ -1 if t_i = -1.
These two inequalities give the decision boundaries. To compute the margin, consider the following two hyperplanes:
wx_1 + b = 1,
wx_2 + b = 0,
where the distance from x_1 to x_2 is the shortest distance between the two hyperplanes. This means that the direction x_2 - x_1 is parallel to w. Subtracting the second equality from the first, w(x_1 - x_2) = 1 is obtained; since w and x_1 - x_2 have the same direction, ||x_1 - x_2|| = 1/||w||. In order to minimize the error, ||x_1 - x_2|| should be maximized, and maximizing ||x_1 - x_2|| is equivalent to minimizing ||w||. Thus the optimal hyperplane can be found by solving the following optimization problem:
(70) min (1/2)||w||^2

Figure 3.3. Classifier with Margin
subject to t_i(w^T x_i + b) ≥ 1.
3.3. SVM Dual Formulation
This section introduces the dual formulation of the SVM optimization problem. The dual formulation is usually more efficient in implementation; also, in the dual formulation, the nonseparable case can be handled in a more general way. The most important aspect of the dual formulation is that the kernel trick can be used in it. The last part of this section derives the dual formulation using the KKT conditions of optimization theory. Consider the Lagrangian function from (70),
(71) L(w, b, λ) = (1/2)||w||^2 - ∑_{i=1}^N λ_i [t_i(w^T x_i + b) - 1],
where λ = (λ_1,..., λ_N), and differentiate with respect to the original variables:
(72) ∂L/∂w (w, b, λ) = w - ∑_{i=1}^N λ_i t_i x_i = 0,
(73) ∂L/∂b (w, b, λ) = -∑_{i=1}^N λ_i t_i = 0.

From (72),
(74) w = ∑_{i=1}^N λ_i t_i x_i.
Substituting (74) into (71),
L(w, b, λ) = (1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j - ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j - b ∑_{i=1}^N λ_i t_i + ∑_{i=1}^N λ_i
= -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + ∑_{i=1}^N λ_i.
Thus SVM-DUAL is the following:
(75) max -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + ∑_{i=1}^N λ_i
s.t. ∑_{i=1}^N λ_i t_i = 0, λ_i ≥ 0.
Once λ is found, (w, b) can be found easily by using w = ∑_{i=1}^N λ_i t_i x_i and wx_i + b = 1 for a support vector x_i in the first class.
3.4. Linearly Nonseparable Training Set
In the case of a linearly nonseparable training set, a linear classifier wx + b = 0 cannot separate the training set correctly; this means that the linear classifier needs to allow some errors. The following three cases can happen with the linear classifier wx + b = 0:
Case 1: t_i(wx_i + b) ≥ 1;
Case 2: 0 ≤ t_i(wx_i + b) < 1;
Case 3: t_i(wx_i + b) < 0.
Cases 2 and 3 have errors, while Case 1 has no error. These errors can be measured using slack variables s_i: in Case 1, s_i = 0; in Case 2, 0 < s_i < 1; and in Case 3, s_i > 1. Figure 3.4 shows the case of a linearly nonseparable set.

Figure 3.4. Linearly Nonseparable Set
In the case of a linearly nonseparable set, the margin is maximized while the number of s_i > 0 is minimized. So there are two goals: the first goal is to maximize the margin, and the second goal is to minimize the number of s_i > 0. Now consider
(76) (1/2)||w||^2 + C ∑_{i=1}^N s_i.
In (76), a constant C is introduced. Consider two extreme cases: when C = 0, the second goal is ignored (this is the case of a linearly separable set), and when C → ∞, the margin is ignored. C is a parameter that the user of SVM can choose to tune the trade-off between the width of the margin and the number of errors s_i. So the optimization problem for the linearly nonseparable case is the following:
(77) min (1/2)||w||^2 + C ∑_{i=1}^N s_i
s.t. t_i(wx_i + b) ≥ 1 - s_i, s_i ≥ 0,

for i = 1, 2,..., N.
Now consider the Lagrangian function from (77),
(78) L(w, b, s, λ, µ) = (1/2)||w||^2 + C ∑_{i=1}^N s_i - ∑_{i=1}^N λ_i [t_i(w^T x_i + b) - 1 + s_i] - ∑_{i=1}^N µ_i s_i,
and differentiate with respect to the original variables:
(79) ∂L/∂w (w, b, s, λ, µ) = w - ∑_{i=1}^N λ_i t_i x_i = 0,
(80) ∂L/∂b (w, b, s, λ, µ) = -∑_{i=1}^N λ_i t_i = 0,
(81) ∂L/∂s_i (w, b, s, λ, µ) = C - λ_i - µ_i = 0.
From (79),
(82) w = ∑_{i=1}^N λ_i t_i x_i.
Substituting (82) into (78),
L(w, b, s, λ, µ) = (1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + C ∑_{i=1}^N s_i - ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j - b ∑_{i=1}^N λ_i t_i - ∑_{i=1}^N λ_i s_i - ∑_{i=1}^N µ_i s_i + ∑_{i=1}^N λ_i
= -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + ∑_{i=1}^N λ_i + C ∑_{i=1}^N s_i - ∑_{i=1}^N λ_i s_i - ∑_{i=1}^N µ_i s_i
= -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + ∑_{i=1}^N λ_i, by (81).
Thus the dual SVM for the linearly nonseparable case is obtained as follows:
(83) max -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j x_i x_j + ∑_{i=1}^N λ_i
s.t. ∑_{i=1}^N λ_i t_i = 0, 0 ≤ λ_i ≤ C, i = 1, 2,..., N.
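For illustration, the dual (83) can be handed to a generic constrained solver; the sketch below uses SciPy's SLSQP method on a toy data set (this solver choice and the data are assumptions of the sketch, not the SMO method or Algorithm 2.1 studied in this dissertation), and then recovers w and b through (82).

import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data with labels +1 / -1.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])
C = 10.0

G = (t[:, None] * t[None, :]) * (X @ X.T)        # G_ij = t_i t_j (x_i, x_j)

def neg_dual(lam):                               # minimize the negative of (83)
    return 0.5 * lam @ G @ lam - lam.sum()

constraints = [{"type": "eq", "fun": lambda lam: lam @ t}]   # sum_i lambda_i t_i = 0
bounds = [(0.0, C)] * len(t)                                 # 0 <= lambda_i <= C

res = minimize(neg_dual, np.zeros(len(t)), method="SLSQP", bounds=bounds, constraints=constraints)
lam = res.x

w = (lam * t) @ X                                # (82): w = sum_i lambda_i t_i x_i
sv = int(np.argmax(lam * (t > 0)))               # a support vector from the first class
b = 1.0 - w @ X[sv]                              # w x_sv + b = 1 on the margin
print(w, b)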

3.5. Kernel SVM
In many problems, a linear classifier cannot achieve both high performance and good generalization in the original space.
Figure 3.5. Nonlinear Classifier
In Figure 3.5, a linear classifier cannot separate the two data sets correctly, so a nonlinear classifier must be considered; as the figure shows, the nonlinear classifier gives good performance (i.e., generalization). Then how does SVM separate a data set that is not linearly separable? Consider the training set S = {(x_1, t_1),..., (x_N, t_N)}, which is not linearly separable. However, S can be linearly separable in a different space, usually of higher dimension than the original space, obtained through a set of real functions φ_1,..., φ_M; here φ is assumed to be unknown. These functions are called features. Using φ, x = (x_1,..., x_m) is mapped to φ(x) = (φ_1(x),..., φ_M(x)); after mapping to the feature space, the new training set S' = {(φ(x_1), t_1),..., (φ(x_N), t_N)} is obtained. To see this clearly, consider the following example. Let S_1 = {(a, 1), (b, -1), (c, -1), (d, 1)}, where a = (0, 0), b = (1, 0), c = (0, 1) and d = (1, 1); S_1 cannot be linearly separated without making an error.

Now let φ(x) = (x_1^2, √2 x_1 x_2, x_2^2), where x = (x_1, x_2). Under the feature function φ(x),
a = (0, 0) → φ(a) = a' = (0, 0, 0),
b = (1, 0) → φ(b) = b' = (1, 0, 0),
c = (0, 1) → φ(c) = c' = (0, 0, 1),
d = (1, 1) → φ(d) = d' = (1, √2, 1).
First, one can check that S'_1 = {(a', 1), (b', -1), (c', -1), (d', 1)} is linearly separable in R^3, so the effect of φ(x) can be seen. However, as mentioned earlier, φ is usually unknown when a kernel function is used. Let x = (x_1, x_2) and y = (y_1, y_2), and define K(x, y) = (x, y)^2 = x_1^2 y_1^2 + 2 x_1 y_1 x_2 y_2 + x_2^2 y_2^2. It is easy to check that K(a, b) = φ(a)φ(b), K(c, d) = φ(c)φ(d), and so on. SVM uses a kernel function whose value equals the value of the inner product in the feature space. Thus, using the new training set S' and (83), consider the following optimization problem:
(84) max -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j φ(x_i)φ(x_j) + ∑_{i=1}^N λ_i
s.t. ∑_{i=1}^N λ_i t_i = 0, 0 ≤ λ_i ≤ C, i = 1, 2,..., N.
Now define the kernel K(x_i, x_j) = φ(x_i)φ(x_j) corresponding to the inner product (x_i, x_j) in the original space. Then (84) can be written as the following:
(85) max -(1/2) ∑_{i=1}^N ∑_{j=1}^N t_i t_j λ_i λ_j K(x_i, x_j) + ∑_{i=1}^N λ_i

s.t. ∑_{i=1}^N λ_i t_i = 0, 0 ≤ λ_i ≤ C, i = 1, 2,..., N.
By the definition of a kernel, K(x_i, x_j) always corresponds to an inner product in some feature space, and the kernel K(x_i, x_j) can be computed without computing the images φ(x_i) and φ(x_j). Replacing the inner products in (84) by K(x_i, x_j), as in (85), is called kernel substitution. By using kernel substitution, inner product evaluations can be computed in the original space without knowing the feature space. The following are popular kernel functions:
1. Polynomial kernels: K(x_i, x_j) = [(x_i, x_j) + c]^d, where d is the degree of the polynomial and c is a constant.
2. Radial basis function kernel: K(x_i, x_j) = e^(-||x_i - x_j||^2 / (2γ^2)), where γ is a parameter.
3. Sigmoid kernel: K(x_i, x_j) = tanh[α(x_i, x_j) + β], where α and β are parameters.
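The kernel identity used in the example above, and the three kernels just listed, can be written directly; the sketch below checks K(x, y) = (x, y)^2 = φ(x)φ(y) numerically with the same two-dimensional feature map (illustrative code; the parameter defaults are mine).

import numpy as np

def phi(x):                                      # feature map (x1^2, sqrt(2) x1 x2, x2^2)
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def k_poly(x, y, c=0.0, d=2):                    # polynomial kernel [(x, y) + c]^d
    return (x @ y + c) ** d

def k_rbf(x, y, gamma=1.0):                      # radial basis kernel exp(-||x - y||^2 / (2 gamma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * gamma ** 2))

def k_sigmoid(x, y, a=1.0, beta=0.0):            # sigmoid kernel tanh(a (x, y) + beta)
    return np.tanh(a * (x @ y) + beta)

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(k_poly(x, y), phi(x) @ phi(y))             # both equal (x, y)^2 = 1.0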

CHAPTER 4
SOLVING SUPPORT VECTOR MACHINES USING THE NEW ALGORITHM
4.1. Solving SVM Problems Using the New Algorithm
This chapter shows that Algorithm 2.1 can be used to solve SVM problems. Section 4.1 shows that solving SVM is equivalent to finding the minimum distance between two convex hulls. Finding the maximal margin of a training data set and finding the minimum distance between the two convex hulls given by the training data set seem to be different processes; yet, after reformulating SVM-NV, SVM-NV can be recognized as an NPP. Let {x_i}_{i=1}^m be a training data set, let I be the index set for the first class and J the index set for the second class, and set
U = {u | u = ∑_{i ∈ I} β_i x_i, ∑_{i ∈ I} β_i = 1, β_i ≥ 0},
V = {v | v = ∑_{j ∈ J} β_j x_j, ∑_{j ∈ J} β_j = 1, β_j ≥ 0}.
The reformulation of SVM-NV was done by Keerthi [2]: the SVM dual is equivalent to
(86) min (1/2) ∑_s ∑_t β_s β_t t_s t_t (x_s, x_t)
s.t. β_s ≥ 0, ∑_{i ∈ I} β_i = 1, ∑_{j ∈ J} β_j = 1.
Now consider an NPP. In an NPP, ||u - v||^2 needs to be minimized:
||u - v||^2 = ||∑_{i ∈ I} β_i x_i - ∑_{j ∈ J} β_j x_j||^2
= (∑_i β_i x_i - ∑_j β_j x_j, ∑_i β_i x_i - ∑_j β_j x_j)
= (∑_i β_i x_i, ∑_i β_i x_i) - (∑_i β_i x_i, ∑_j β_j x_j) - (∑_j β_j x_j, ∑_i β_i x_i) + (∑_j β_j x_j, ∑_j β_j x_j)

(87) = ∑_i ∑_{i'} β_i β_{i'} (x_i, x_{i'}) - ∑_i ∑_j β_i β_j (x_i, x_j) - ∑_j ∑_i β_j β_i (x_j, x_i) + ∑_j ∑_{j'} β_j β_{j'} (x_j, x_{j'}).
Note that in (87), if the inner product involves two vectors from the same index set, then the sign of the term is positive; otherwise it is negative. Thus (87) can be written as the following:
(88) ∑_s ∑_t β_s β_t t_s t_t (x_s, x_t),
where t_s t_t = 1 if x_s and x_t are from the same index set and t_s t_t = -1 otherwise.
The minimum distance between two convex hulls is nonzero if and only if the two convex hulls do not intersect; that is, if the two convex hulls intersect, then the minimum distance is zero. As shown earlier in Chapter 3, real SVM problems have some violations, and if some x_i violates the margin, the two convex hulls could intersect. However, a minimum distance of zero between the two convex hulls is not desired. Here, Friess recommends using a sum of squared violations in the cost function:
min (1/2)||w||^2 + (C/2) ∑_k s_k^2
s.t. wx_i + b ≥ 1 - s_i, i ∈ I,
wx_j + b ≤ -1 + s_j, j ∈ J.
This problem is called the SVM with a sum of squared violations (SVM-VQ). SVM-VQ can be converted to (89) by a simple transformation. To see this, set
w̃ = (w, √C s), b̃ = b, x̃_i = (x_i, e_i/√C) for i ∈ I, and x̃_j = (x_j, -e_j/√C) for j ∈ J,
where e_i is the m-dimensional vector whose i-th component is 1 and all others are 0. Then
(1/2)||w̃||^2 = (1/2)||w||^2 + (C/2) ∑_k s_k^2,
w̃ x̃_i + b̃ = wx_i + b + s_i,
w̃ x̃_j + b̃ = wx_j + b - s_j.
Thus SVM-VQ is equivalent to
(89) min (1/2)||w̃||^2
s.t. w̃ x̃_i + b̃ ≥ 1, i ∈ I,
w̃ x̃_j + b̃ ≤ -1, j ∈ J.
Here the feasible region of SVM-VQ is non-empty because of the slack variables s_i, and the solution of (89) is automatically feasible. By setting x̃_i = (x_i, e_i/√C), i ∈ I, and x̃_j = (x_j, -e_j/√C), j ∈ J, the new algorithm can be used as usual; a small sketch of this augmentation follows.
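The augmentation that turns SVM-VQ into a hard-margin problem is easy to write down; the following sketch builds the augmented vectors (x_i, e_i/√C) and (x_j, -e_j/√C) for a toy data set (the index bookkeeping and names are my own, shown only to make the transformation concrete).

import numpy as np

def augment(X, labels, C):
    # Append the e_k / sqrt(C) block so that SVM-VQ becomes a separable NPP instance.
    # X: (n, p) data matrix; labels: +1 for the first class (I), -1 for the second (J).
    n = X.shape[0]
    E = np.eye(n) / np.sqrt(C)                   # e_k / sqrt(C), one row per point
    signs = np.where(labels > 0, 1.0, -1.0)      # +e_i/sqrt(C) for I, -e_j/sqrt(C) for J
    return np.hstack([X, signs[:, None] * E])

X = np.array([[1.0, 1.0], [2.0, 0.0], [-1.0, -1.0]])
labels = np.array([1, 1, -1])
print(augment(X, labels, C=4.0))                 # each point gains its own slack coordinate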

4.2. Stopping Method for the Algorithm
This section shows the stopping method for the NPP when using Algorithm 2.1; it uses the KKT conditions of optimization theory. First, consider the general quadratic programming problem:
(Quadratic Programming Problem) min (1/2) x^t Q x + cx + d
s.t. Ax - b = 0, x ≥ 0.
The KKT conditions for the quadratic programming problem are the following. With the Lagrangian
L(x, λ, µ) = (1/2) x^t Q x + cx + d - λ(Ax - b) - µ x,
1. ∂L(x, λ, µ)/∂x = Qx + c - λ^t A - µ = 0;
2. λ(Ax - b) = 0, µ^t x = 0;
3. λ ≥ 0, µ ≥ 0.
By the KKT conditions, if x_i ≠ 0 then (Qx + c - λ^t A)_i = 0. This means that if the set of nonzero indices is known, then (Qx + c - λ^t A)_i = 0 for those indices; using condition 2, a linear system can be solved in order to solve the quadratic programming problem.

By (88), the NPP is equivalent to the following quadratic programming problem:
(NPP-QP) min (1/2) a^t A a
s.t. ∑_{i=1} a_i = 2, a ≥ 0,
where A = [X^t X, -X^t Y; -Y^t X, Y^t Y].
Consider the Lagrangian for NPP-QP,
L(a, λ, µ) = (1/2) a^t A a - λ(a^t e - 2) - µ^t a.
Then by the KKT conditions,
1. Aa - λe - µ = 0;
2. λ(a^t e - 2) = 0;
3. µ^t a = 0;
4. by 1, µ = Aa - λe;
5. by 3 and 4, if a_i ≠ 0, then (Aa - λe)_i = 0;
6. by 3 and 4, if a_i = 0, then (Aa - λe)_i ≥ 0.
Then, by the above, if the nonzero indices of a are known, only the linear system given by 2 and 5 needs to be solved. Finally, by Theorem 4 in Chapter 2, the indices for u_k and v_k do not change for large k; from this it can be assumed that u_k and v_k are on G and G' respectively. Once u_k and v_k are on G and G' respectively, the nonzero indices of a can be determined in NPP-QP. After solving the linear system given by 2 and 5, condition 6 must be checked to determine whether the zero indices are satisfied, as in the sketch below.
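Once the nonzero index set is fixed, conditions 2 and 5 reduce to a small linear system; the sketch below solves that system and then checks condition 6 on the remaining indices. It is illustrative only: the matrix A is the NPP-QP matrix, and in practice the guessed active set would come from the representations of u_k and v_k.

import numpy as np

def solve_active_set(A, active, total=2.0, tol=1e-9):
    # A: the NPP-QP matrix [[X^t X, -X^t Y], [-Y^t X, Y^t Y]].
    # active: indices believed to satisfy a_i > 0 (the support vectors).
    S = np.asarray(active)
    n = A.shape[0]
    # Condition 5: (A a - lambda e)_i = 0 on S; condition 2: the a_i sum to `total`.
    K = np.zeros((len(S) + 1, len(S) + 1))
    K[:len(S), :len(S)] = A[np.ix_(S, S)]
    K[:len(S), -1] = -1.0                        # the -lambda e column
    K[-1, :len(S)] = 1.0                         # the sum constraint row
    rhs = np.zeros(len(S) + 1)
    rhs[-1] = total
    sol = np.linalg.solve(K, rhs)
    a = np.zeros(n)
    a[S] = sol[:len(S)]
    lam = sol[-1]
    rest = np.setdiff1d(np.arange(n), S)
    ok = np.all(a[S] >= -tol) and np.all((A @ a - lam)[rest] >= -tol)   # condition 6
    return a, lam, ok                            # ok == False: the guessed index set was wrong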

CHAPTER 5
EXPERIMENTS
5.1. Comparison with Linearly Separable Sets
This section compares the performance of Algorithm 2.1 and sequential minimal optimization (SMO) on linearly separable sets. MATLAB was used to compare the performance. First, the performance of Algorithm 2.1 and the MDM algorithm is compared in dimension 100; by increasing the number of points, the results of the experiments can be seen in Figure 5.1. Second, twenty non-intersecting pairs of convex hulls in dimensions 100, 200, 300, 500, 700, 1000, 1200, 2000, and 2500 are randomly generated, and the minimum distances between the two convex hulls are computed. In each dimension, the number of vectors is changed in order to see the performance of the two algorithms. The following graphs show CPU times for Algorithm 2.1 and SMO with a linear kernel with coefficient 0, a linear kernel with coefficient 1, a polynomial kernel with degree 3, and a polynomial kernel with degree 5. In the graphs, the linear kernel with coefficient 0 is denoted by Lin(0), the linear kernel with coefficient 1 by Lin(1), the polynomial kernel with degree 3 by P3, and the polynomial kernel with degree 5 by P5. Also in the graphs, an expression such as 7000X100 indicates that 7000 vectors are in dimension 100. Figures 5.2 to 5.12 illustrate the results of the experiments in Section 5.1.
5.2. Comparison with Real World Problems
In this section, the performance of Algorithm 2.1 and sequential minimal optimization (SMO) is compared on real world problems. Four kernel functions are used to compare the performance of Algorithm 2.1 and SMO: a polynomial kernel with degree 3, a polynomial kernel with degree 5, a radial basis kernel with gamma 1 and a radial basis kernel with gamma 10, denoted d3, d5, g1 and g10 respectively. Figures 5.13 to 5.16 illustrate the results of the experiments in Section 5.2.

Figure 5.1. Algorithm 2.1 and the MDM Algorithm (CPU times versus the number of points).

Figure 5.2. CPU seconds for SMO and Algorithm 2.1 (kernels Lin(0), Lin(1), P3, P5).

Figure 5.3. CPU seconds for SMO and Algorithm 2.1 (kernels Lin(0), Lin(1), P3, P5).

Figure 5.4. Points in Dimension 500 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.5. 1000 Points in Dimension 1000 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.6. Points in Dimension 1200 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.7. 2000 Points in Dimension 2000 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.8. Points in Dimension 2500 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.9. 7000 Points in Dimension 100 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.10. 7000 Points in Dimension 300 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.11. 7000 Points in Dimension 500 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.12. 7000 Points in Dimension 700 (CPU seconds for SMO and Algorithm 2.1; kernels Lin(0), Lin(1), P3, P5).

Figure 5.13. Heart (270X13) (CPU seconds for SMO and Algorithm 2.1; kernels P3, P5, g1, g10).

Figure 5.14. Breast Cancer (263X9) (CPU seconds for SMO and Algorithm 2.1; kernels P3, P5, g1, g10).

Figure 5.15. Titanic (24X3) (CPU seconds for SMO and Algorithm 2.1; kernels P3, P5, g1, g10).

Figure 5.16. Thyroid (215X5) (CPU seconds for SMO and Algorithm 2.1; kernels P3, P5, g1, g10).

CHAPTER 6
CONCLUSION AND FUTURE WORKS
6.1. Conclusion
As shown in Section 5.1, the performance of SMO is better than that of Algorithm 2.1 on the lower-dimensional test sets, while on the higher-dimensional sets Algorithm 2.1 works better than SMO: Algorithm 2.1 performs better as the dimension grows. To show this, fix the number of points and increase the dimension from 100 to 300. In dimensions 100 and 300 with 7000 points, SMO performs faster than Algorithm 2.1; however, on data sets randomly generated by MATLAB, the CPU time of Algorithm 2.1 is shorter than that of SMO in dimensions 500 and 700 with 7000 points. Therefore, on randomly generated linearly separable sets in higher dimensions, Algorithm 2.1 performs better than SMO. Many application problems have a small number of support vectors, and data sets generated by MATLAB usually have a small number of support vectors as well; if a data set has a large number of support vectors, then Algorithm 2.1 may perform slower than SMO. Unfortunately, most real world data sets are low dimensional. Since the performance of SMO is better than that of Algorithm 2.1 on low dimensional linearly separable sets, SMO performed better on those data sets; but, as shown in Section 5.2, Algorithm 2.1 is still comparable with SMO. By Theorem 4 in Chapter 2 and the stopping method of Section 4.2, Algorithm 2.1 can find the solution of the NPP in a finite number of iterations.
6.2. Future Work
In Algorithm 2.1, two coefficients of the representation of u_k are updated at each iteration, but in fact four or more coefficients of u_k can be updated. Instead of using

u_{k+1} = u_k + α_{i'_k}(x_{i_k} - x_{i'_k}), the update u_{k+1} = u_k + t_k ᾱ_k(x_{min,av} - x_{max,av}) can be used. In the new update, 0 ≤ t_k ≤ 1, ᾱ_k = min{α_{i'_k}, α_{i'_2,k}}, x_{min,av} = (x_{i_k} + x_{i_2,k})/2 and x_{max,av} = (x_{i'_k} + x_{i'_2,k})/2, where (x_{i_k}, u_k - v_k) = min_i (x_i, u_k - v_k), (x_{i_2,k}, u_k - v_k) = min_{i ≠ i_k} (x_i, u_k - v_k), (x_{i'_k}, u_k - v_k) = max_{α_i > 0} (x_i, u_k - v_k) and (x_{i'_2,k}, u_k - v_k) = max_{α_i > 0, i ≠ i'_k} (x_i, u_k - v_k). With this expression, four coefficients of u_{k+1} are updated at a time; the precise method of updating four coefficients at a time is shown in Appendix B. When Algorithm 2.1 and the improved version of Algorithm 2.1 are compared, the number of iterations for the improved version is always smaller than for Algorithm 2.1; by using the above method, the CPU time of Algorithm 2.1 is improved. Moreover, the improved Algorithm 2.1 could perform better than SMO in lower dimensions.

APPENDIX A
IMPROVED MDM
Define (u'_2, u) = max_{α_j > 0, j ≠ i'} (x_j, u) and (u''_2, u) = min_{j ≠ i''} (x_j, u), and set u''_av = (u'' + u''_2)/2 and u'_av = (u' + u'_2)/2, where u' = x_{i'} and u'' = x_{i''} are the maximizing and minimizing vertices used in the MDM algorithm. This section presents the improved MDM algorithm, which uses the average points u'_av and u''_av.
Improved MDM
Step 1 Choose u_k ∈ U and find u'_k and u''_k.
Step 2 Find u'_{2,k} and u''_{2,k}. If u'_{2,k} does not exist, then use the MDM step. If (u'_{2,k}, u_k) ≤ (u''_k, u_k), then use the MDM step. If (u''_{2,k}, u_k) > (u'_k, u_k), then use the MDM step. If δ(u_k) > Δ_k(u_k) = (u'_{av,k} - u''_{av,k}, u_k), then use the MDM step.
Step 3 Otherwise, set u_k(t) = u_k + t ᾱ_k(u''_{av,k} - u'_{av,k}), where ᾱ_k = min{α_{i'}, α_{i'_2}} and t ∈ [0, 1]; ᾱ_k = min{α_{i'}, α_{i'_2}} is used because each coefficient needs to remain greater than or equal to 0 in order to stay in the convex hull.
Step 4 Set u_{k+1} = u_k + t_k ᾱ_k(u''_{av,k} - u'_{av,k}), where (u_k(t_k), u_k(t_k)) = min_{t ∈ [0,1]} (u_k(t), u_k(t)).
Step 5 Iterate until Δ_k(u_k) = (u'_{av,k} - u''_{av,k}, u_k) → 0.
In the representation u_k = ∑_{i=1}^m α_{i,k} x_i, the improved MDM updates 4 coefficients of the representation while the MDM updates 2 coefficients; this makes the iteration faster. However, updating too many coefficients could make the iteration slow near the solution u*. By Step 4, ||u_{k+1}|| ≤ ||u_k|| is obtained.
The proof of the convergence uses the following assumption.
Assumption 1
(1) u'_{2,k} exists;
(2) (u'_{2,k}, u_k) > (u''_k, u_k);
(3) (u''_{2,k}, u_k) ≤ (u'_k, u_k);
(4) δ(u_k) ≤ Δ_k(u_k).

Define δ(u) = (u, u) - (u''_av, u); then clearly δ(u) ≥ 0. By Corollary 2 to Lemma 2 in [1], if {u_k}, u_k ∈ U, k = 0, 1, 2,..., is a sequence of points such that ||u_{k+1}|| ≤ ||u_k|| and there is a subsequence {u_{k_j}} such that δ(u_{k_j}) → 0, then u_k → u*. Also, by Assumption 1, for any u = ∑_{i=1}^m α_i x_i ∈ U, δ(u) ≤ Δ(u).
Lemma A.1 lim_{k→∞} ᾱ_k Δ_k(u_k) = 0, where ᾱ_k = min{α_{i'}, α_{i'_2}}.
(Proof) Let u_k(t) = u_k + 2tᾱ_k(u''_{av,k} - u'_{av,k}). Then
(u_k(t), u_k(t)) = (u_k, u_k) - 4tᾱ_k Δ_k(u_k) + 4ᾱ_k^2 t^2 ||u'_{av,k} - u''_{av,k}||^2.
Suppose lim_k ᾱ_k Δ_k(u_k) ≠ 0. Then there is a subsequence u_{k_j} such that ᾱ_{k_j} Δ_{k_j}(u_{k_j}) ≥ ε > 0, and
(u_{k_j}(t), u_{k_j}(t)) ≤ (u_{k_j}, u_{k_j}) - 4tε + 16t^2 d^2, where d = max_{l,p} ||x_l - x_p||.
The right-hand side is minimized at t = min{ε/(8d^2), 1}. If t = ε/(8d^2), then
(u_{k_j}(t), u_{k_j}(t)) ≤ (u_{k_j}, u_{k_j}) - 4(ε/(8d^2))ε + 16(ε/(8d^2))^2 d^2 = (u_{k_j}, u_{k_j}) - ε^2/(4d^2),
and if t = 1 (so that ε ≥ 8d^2), then
(u_{k_j}(t), u_{k_j}(t)) ≤ (u_{k_j}, u_{k_j}) - 4ε + 16d^2 ≤ (u_{k_j}, u_{k_j}) - 2ε.
In both cases (u_{k_j}(t), u_{k_j}(t)) decreases by a fixed positive amount; since (u_k, u_k) is nonincreasing and bounded below, this is a contradiction. Thus lim_{k→∞} ᾱ_k Δ_k(u_k) = 0.
Lemma A.2 lim_{k→∞} Δ_k(u_k) = 0.
(Proof) Suppose lim_k Δ_k(u_k) = Δ^#(u^#) > 0. Then for large k > K, Δ_k(u_k) ≥ Δ^#(u^#)/2. By Lemma A.1, ᾱ_k → 0. The function (u_k(t), u_k(t)) is minimized at t_k = Δ_k(u_k)/(2ᾱ_k ||u'_{av,k} - u''_{av,k}||^2). Hence, for large k > K' ≥ K, t_k ≥ 1, and thus u_{k+1} = u_k + 2ᾱ_k(u''_{av,k} - u'_{av,k}). Let {u_{k_j}} be a subsequence such that Δ_{k_j}(u_{k_j}) → Δ^#(u^#). In the sequence {u_{k_j}}, some terms can be omitted if necessary, so that a sequence satisfying the following three conditions can be found:
(1) α_{i', k_j} → α^#


More information

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space. Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Maximal Margin Classifier

Maximal Margin Classifier CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Support Vector Machines. Jie Tang Knowledge Engineering Group Department of Computer Science and Technology Tsinghua University 2012

Support Vector Machines. Jie Tang Knowledge Engineering Group Department of Computer Science and Technology Tsinghua University 2012 Support Vector Machnes Je Tang Knowledge Engneerng Group Department of Computer Scence and Technology Tsnghua Unversty 2012 1 Outlne What s a Support Vector Machne? Solvng SVMs Kernel Trcks 2 What s a

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them?

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them? Image classfcaton Gven te bag-of-features representatons of mages from dfferent classes ow do we learn a model for dstngusng tem? Classfers Learn a decson rule assgnng bag-offeatures representatons of

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Kristin P. Bennett. Rensselaer Polytechnic Institute

Kristin P. Bennett. Rensselaer Polytechnic Institute Support Vector Machnes and Other Kernel Methods Krstn P. Bennett Mathematcal Scences Department Rensselaer Polytechnc Insttute Support Vector Machnes (SVM) A methodology for nference based on Statstcal

More information

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION

More information

CSE 252C: Computer Vision III

CSE 252C: Computer Vision III CSE 252C: Computer Vson III Lecturer: Serge Belonge Scrbe: Catherne Wah LECTURE 15 Kernel Machnes 15.1. Kernels We wll study two methods based on a specal knd of functon k(x, y) called a kernel: Kernel

More information

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan Kernels n Support Vector Machnes Based on lectures of Martn Law, Unversty of Mchgan Non Lnear separable problems AND OR NOT() The XOR problem cannot be solved wth a perceptron. XOR Per Lug Martell - Systems

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Section 8.3 Polar Form of Complex Numbers

Section 8.3 Polar Form of Complex Numbers 80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Pattern Classification

Pattern Classification Pattern Classfcaton All materals n these sldes ere taken from Pattern Classfcaton (nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wley & Sons, 000 th the permsson of the authors and the publsher

More information

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z ) C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

Lecture 20: November 7

Lecture 20: November 7 0-725/36-725: Convex Optmzaton Fall 205 Lecturer: Ryan Tbshran Lecture 20: November 7 Scrbes: Varsha Chnnaobreddy, Joon Sk Km, Lngyao Zhang Note: LaTeX template courtesy of UC Berkeley EECS dept. Dsclamer:

More information

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem. prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove

More information

Nonlinear Classifiers II

Nonlinear Classifiers II Nonlnear Classfers II Nonlnear Classfers: Introducton Classfers Supervsed Classfers Lnear Classfers Perceptron Least Squares Methods Lnear Support Vector Machne Nonlnear Classfers Part I: Mult Layer Neural

More information

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution.

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution. Solutons HW #2 Dual of general LP. Fnd the dual functon of the LP mnmze subject to c T x Gx h Ax = b. Gve the dual problem, and make the mplct equalty constrants explct. Soluton. 1. The Lagrangan s L(x,

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16 STAT 39: MATHEMATICAL COMPUTATIONS I FALL 218 LECTURE 16 1 why teratve methods f we have a lnear system Ax = b where A s very, very large but s ether sparse or structured (eg, banded, Toepltz, banded plus

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014 COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #16 Scrbe: Yannan Wang Aprl 3, 014 1 Introducton The goal of our onlne learnng scenaro from last class s C comparng wth best expert and

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

Support Vector Machines

Support Vector Machines /14/018 Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 493 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces you have studed thus far n the text are real vector spaces because the scalars

More information

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of

More information

The Study of Teaching-learning-based Optimization Algorithm

The Study of Teaching-learning-based Optimization Algorithm Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

Formulas for the Determinant

Formulas for the Determinant page 224 224 CHAPTER 3 Determnants e t te t e 2t 38 A = e t 2te t e 2t e t te t 2e 2t 39 If 123 A = 345, 456 compute the matrx product A adj(a) What can you conclude about det(a)? For Problems 40 43, use

More information

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b Int J Contemp Math Scences, Vol 3, 28, no 17, 819-827 A New Refnement of Jacob Method for Soluton of Lnear System Equatons AX=b F Naem Dafchah Department of Mathematcs, Faculty of Scences Unversty of Gulan,

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0 MODULE 2 Topcs: Lnear ndependence, bass and dmenson We have seen that f n a set of vectors one vector s a lnear combnaton of the remanng vectors n the set then the span of the set s unchanged f that vector

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Support Vector Machines

Support Vector Machines Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x n class

More information

A 2D Bounded Linear Program (H,c) 2D Linear Programming

A 2D Bounded Linear Program (H,c) 2D Linear Programming A 2D Bounded Lnear Program (H,c) h 3 v h 8 h 5 c h 4 h h 6 h 7 h 2 2D Lnear Programmng C s a polygonal regon, the ntersecton of n halfplanes. (H, c) s nfeasble, as C s empty. Feasble regon C s unbounded

More information

On a direct solver for linear least squares problems

On a direct solver for linear least squares problems ISSN 2066-6594 Ann. Acad. Rom. Sc. Ser. Math. Appl. Vol. 8, No. 2/2016 On a drect solver for lnear least squares problems Constantn Popa Abstract The Null Space (NS) algorthm s a drect solver for lnear

More information

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018 INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Inexact Newton Methods for Inverse Eigenvalue Problems

Inexact Newton Methods for Inverse Eigenvalue Problems Inexact Newton Methods for Inverse Egenvalue Problems Zheng-jan Ba Abstract In ths paper, we survey some of the latest development n usng nexact Newton-lke methods for solvng nverse egenvalue problems.

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

A Local Variational Problem of Second Order for a Class of Optimal Control Problems with Nonsmooth Objective Function

A Local Variational Problem of Second Order for a Class of Optimal Control Problems with Nonsmooth Objective Function A Local Varatonal Problem of Second Order for a Class of Optmal Control Problems wth Nonsmooth Objectve Functon Alexander P. Afanasev Insttute for Informaton Transmsson Problems, Russan Academy of Scences,

More information

Online Classification: Perceptron and Winnow

Online Classification: Perceptron and Winnow E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpenCourseWare http://ocw.mt.edu 6.854J / 18.415J Advanced Algorthms Fall 2008 For nformaton about ctng these materals or our Terms of Use, vst: http://ocw.mt.edu/terms. 18.415/6.854 Advanced Algorthms

More information

Salmon: Lectures on partial differential equations. Consider the general linear, second-order PDE in the form. ,x 2

Salmon: Lectures on partial differential equations. Consider the general linear, second-order PDE in the form. ,x 2 Salmon: Lectures on partal dfferental equatons 5. Classfcaton of second-order equatons There are general methods for classfyng hgher-order partal dfferental equatons. One s very general (applyng even to

More information

A new Approach for Solving Linear Ordinary Differential Equations

A new Approach for Solving Linear Ordinary Differential Equations , ISSN 974-57X (Onlne), ISSN 974-5718 (Prnt), Vol. ; Issue No. 1; Year 14, Copyrght 13-14 by CESER PUBLICATIONS A new Approach for Solvng Lnear Ordnary Dfferental Equatons Fawz Abdelwahd Department of

More information

Min Cut, Fast Cut, Polynomial Identities

Min Cut, Fast Cut, Polynomial Identities Randomzed Algorthms, Summer 016 Mn Cut, Fast Cut, Polynomal Identtes Instructor: Thomas Kesselhem and Kurt Mehlhorn 1 Mn Cuts n Graphs Lecture (5 pages) Throughout ths secton, G = (V, E) s a mult-graph.

More information

Advanced Introduction to Machine Learning

Advanced Introduction to Machine Learning Advanced Introducton to Machne Learnng 10715, Fall 2014 The Kernel Trck, Reproducng Kernel Hlbert Space, and the Representer Theorem Erc Xng Lecture 6, September 24, 2014 Readng: Erc Xng @ CMU, 2014 1

More information

Lecture 5 Decoding Binary BCH Codes

Lecture 5 Decoding Binary BCH Codes Lecture 5 Decodng Bnary BCH Codes In ths class, we wll ntroduce dfferent methods for decodng BCH codes 51 Decodng the [15, 7, 5] 2 -BCH Code Consder the [15, 7, 5] 2 -code C we ntroduced n the last lecture

More information

Interactive Bi-Level Multi-Objective Integer. Non-linear Programming Problem

Interactive Bi-Level Multi-Objective Integer. Non-linear Programming Problem Appled Mathematcal Scences Vol 5 0 no 65 3 33 Interactve B-Level Mult-Objectve Integer Non-lnear Programmng Problem O E Emam Department of Informaton Systems aculty of Computer Scence and nformaton Helwan

More information

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS) Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998

More information