CS407 Neural Computation
1 CS407 Neural Computation. Lecture 4: Single Layer Perceptron (SLP) Classifiers. Lecturer: A/Prof. M. Bennamoun
2 Outline: What is a SLP and what is classification? Limitations of a single perceptron. Foundations of classification and Bayes decision making theory. Discriminant functions, linear machine and minimum distance classification. Training and classification using the discrete perceptron. Single-layer continuous perceptron networks for linearly separable classifications. Appendix A: Unconstrained optimization techniques. Appendix B: Perceptron convergence proof. Suggested reading and references.
3 What is a perceptron and what is a Single Layer Perceptron (SLP)?
4 Perceptron: the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and bias, and performs pattern classification with only two classes. Perceptron convergence theorem: if the pattern vectors are drawn from two linearly separable classes, then during training the perceptron algorithm converges and positions the decision surface in the form of a hyperplane between the two classes by adjusting the synaptic weights.
5 What is a perceptron? [Neuron diagram.] Input signals x1, ..., xm with synaptic weights wk1, ..., wkm feed a summing junction with bias bk: vk = sum_j wkj xj + bk. The activation function φ(.) produces the output yk = φ(vk). Discrete perceptron: φ = sgn. Continuous perceptron: φ is S-shaped.
6 Activation Function of a perceptron. Discrete perceptron: the signum function, φ(v) = +1 for v >= 0 and -1 for v < 0. Continuous perceptron: φ(v) is S-shaped (sigmoid).
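The two activation functions can be sketched in a few lines of Python (illustrative helper names; lam is the steepness parameter of the S-shaped curve):

```python
import math

def sgn(v):
    """Signum activation of the discrete perceptron: +1 for v >= 0, else -1."""
    return 1.0 if v >= 0 else -1.0

def bipolar_sigmoid(v, lam=1.0):
    """S-shaped activation of the continuous perceptron; output in (-1, 1),
    approaching sgn(v) as the steepness lam grows."""
    return 2.0 / (1.0 + math.exp(-lam * v)) - 1.0
```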
7 SLP Architecture. A single layer perceptron: input layer and output layer.
8 Where are we heading? Different non-linearly separable problems. Single-layer structure: decision regions are half planes bounded by a hyperplane; cannot solve the exclusive-OR problem or classes with meshed regions. Two-layer structure: convex open or closed decision regions. Three-layer structure: decision regions of arbitrary complexity (limited by the number of nodes); handles the exclusive-OR problem, meshed regions, and the most general region shapes.
9 Review from last lectures:
10 Implementing Logic Gates with Perceptrons. We can use the perceptron to implement the basic logic gates AND, OR and NOT. All we need to do is find the appropriate connection weights and neuron thresholds to produce the right outputs for each set of inputs. We saw how we can construct simple networks that perform NOT, AND, and OR. It is then a well-known result from logic that we can construct any logical function from these three operations. The resulting networks, however, will usually have a much more complex architecture than a simple perceptron. We generally want to avoid decomposing complex problems into simple logic gates, by finding the weights and thresholds that work directly in a perceptron architecture.
11 Implementation of Logical NOT, AND, and OR. In each case we have inputs in_i and outputs out, and need to determine the weights and thresholds. It is easy to find solutions by inspection:
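By-inspection solutions are easy to check in code. The weights and thresholds below are one of the infinitely many valid choices, picked for illustration:

```python
def tlu(weights, theta, inputs):
    """Threshold logic unit: fire (output 1) iff the weighted sum exceeds theta."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > theta else 0

# One valid choice of weights/thresholds per gate (many others work too).
def logic_and(a, b): return tlu([1.0, 1.0], 1.5, [a, b])
def logic_or(a, b):  return tlu([1.0, 1.0], 0.5, [a, b])
def logic_not(a):    return tlu([-1.0], -0.5, [a])
```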
12 The Need to Find Weights Analytically. Constructing simple networks by hand is one thing, but what about harder problems? How long do we keep looking for a solution? We need to be able to calculate appropriate parameters rather than looking for solutions by trial and error. Each training pattern produces a linear inequality for the output in terms of the inputs and the network parameters; these can be used to compute the weights and thresholds.
13 Finding Weights Analytically for the AND Network. We have two weights w1 and w2 and the threshold θ, and each training pattern imposes one condition on the output, so the training data lead to four inequalities. It is easy to see that there are an infinite number of solutions. Similarly, there are an infinite number of solutions for the NOT and OR networks.
14 Limitations of Simple Perceptrons. We can follow the same procedure for the XOR network. Clearly the second and third inequalities are incompatible with the fourth, so there is in fact no solution. We need more complex networks, e.g. networks that combine together many simple networks, or that use different activation/thresholding/transfer functions. It then becomes much more difficult to determine all the weights and thresholds by hand; these weights are instead adapted using learning rules. Hence, we need to consider learning rules (see previous lecture), and more complex architectures.
15 E.g. Decision Surface of a Perceptron. [Figures: a linearly separable pattern set vs. a non-linearly separable one.] The perceptron is able to represent some useful functions, but functions that are not linearly separable (e.g. XOR) are not representable.
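The XOR impossibility can be verified numerically. This sketch brute-forces an arbitrarily chosen parameter grid: it finds threshold-unit parameters for AND but none for XOR:

```python
import itertools

def realizes(table, w1, w2, theta):
    """True iff a single TLU out = [w1*a + w2*b > theta] reproduces the truth table."""
    return all((1 if w1 * a + w2 * b > theta else 0) == t
               for (a, b), t in table.items())

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

grid = [x / 2.0 for x in range(-8, 9)]  # candidate values -4.0 .. 4.0
xor_found = any(realizes(XOR, *p) for p in itertools.product(grid, repeat=3))
and_found = any(realizes(AND, *p) for p in itertools.product(grid, repeat=3))
```

No grid, however fine, can succeed for XOR: the inequalities for (0,1) and (1,0) add up to w1 + w2 > 2θ, while (0,0) forces θ >= 0 and (1,1) demands w1 + w2 <= θ, a contradiction.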
16 What is classification?
17 Classification? Pattern classification/recognition: assign the input data (a physical object, event, or phenomenon) to one of the pre-specified classes (categories). [Block diagram of the recognition and classification system.]
18 Classification: an example (Duda & Hart, Chapter 1). Automate the process of sorting incoming fish on a conveyor belt according to species (salmon or sea bass). Set up a camera, take some sample images, and note the physical differences between the two types of fish: length, lightness, width, number & shape of fins, position of the mouth.
19 Classification: an example. [Figure.]
20 Classification: an example. The cost of misclassification depends on the application: is it better to misclassify salmon as bass or vice versa? Put salmon in a can of bass: lose profit. Put bass in a can of salmon: lose the customer. There is a cost associated with our decision, so make the decision that minimizes a given cost. Feature extraction is problem- and domain-dependent and requires knowledge of the domain; a good feature extractor would make the job of the classifier trivial.
21 Bayesian decision theory
22 Bayesian Decision Theory (Duda & Hart, Chapter 2). Bayesian decision theory is a fundamental statistical approach to the problem of pattern classification: decision making when all the probabilistic information is known. For given probabilities the decision is optimal; when new information is added, it is assimilated in optimal fashion to improve decisions.
23 Bayesian Decision Theory. Fish example: each fish is in one of two states, sea bass or salmon. Let ω denote the state of nature: ω = ω1 for sea bass, ω = ω2 for salmon.
24 Bayesian Decision Theory. The state of nature is unpredictable: ω is a variable that must be described probabilistically. If the catch produced as much salmon as sea bass, the next fish is equally likely to be sea bass or salmon. Define P(ω1): the a priori probability that the next fish is sea bass, and P(ω2): the a priori probability that the next fish is salmon.
25 Bayesian Decision Theory. If other types of fish are irrelevant: P(ω1) + P(ω2) = 1. Prior probabilities reflect our prior knowledge (e.g. time of year, fishing area, ...). Simple decision rule, made without seeing the fish: decide ω1 if P(ω1) > P(ω2); ω2 otherwise. This is OK when deciding for one fish, but if there are several fish, all are assigned to the same class.
26 Bayesian Decision Theory. In general, we will have some features and more information. Feature: a lightness measurement x. Different fish yield different lightness readings, so x is a random variable.
27 Bayesian Decision Theory. Define p(x|ω1), the class-conditional probability density: the probability density function for x given that the state of nature is ω1. The difference between p(x|ω1) and p(x|ω2) describes the difference in lightness between sea bass and salmon.
28 Class-conditional probability density p(x|ω). [Figure: hypothetical class-conditional probability density functions.] The densities are normalized: the area under each curve is 1.0.
29 Bayesian Decision Theory. Suppose that we know the prior probabilities P(ω1) and P(ω2) and the conditional densities p(x|ω1) and p(x|ω2), and we measure the lightness of a fish, x. What is the category of the fish?
30 Bayes' Formula. Given the prior probabilities P(ωj), the conditional densities p(x|ωj), and a measurement of a particular item (feature value x), Bayes' formula states: P(ωj|x) = p(x|ωj) P(ωj) / p(x), i.e. Posterior = (Likelihood x Prior) / Evidence, where p(x) = sum_i p(x|ωi) P(ωi), so that sum_j P(ωj|x) = 1.
31 Bayes' formula. p(x|ωj) is called the likelihood of ωj with respect to x: the ωj category for which p(x|ωj) is large is more "likely" to be the true category. p(x) is the evidence: how frequently we will measure a pattern with feature value x. It is a scale factor that guarantees that the posterior probabilities sum to 1.
32 Posterior Probability. [Figure: posterior probabilities for the particular priors P(ω1) = 2/3 and P(ω2) = 1/3.] At every x the posteriors sum to 1.
33 Error. If we decide ω2, P(error|x) = P(ω1|x); if we decide ω1, P(error|x) = P(ω2|x). For a given x, we can minimize the probability of error by deciding ω1 if P(ω1|x) > P(ω2|x) and ω2 otherwise.
34 Bayes' Decision Rule. The rule that minimizes the probability of error: decide ω1 if P(ω1|x) > P(ω2|x); ω2 otherwise. Equivalently: decide ω1 if p(x|ω1)P(ω1) > p(x|ω2)P(ω2); ω2 otherwise. Then P(error|x) = min[P(ω1|x), P(ω2|x)]. In likelihood-ratio form: decide ω1 if the likelihood ratio p(x|ω1)/p(x|ω2) exceeds the threshold P(ω2)/P(ω1), and ω2 otherwise.
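The two-category rule can be written directly from these formulas. A minimal sketch for a single observed feature value x, taking the likelihood values as given:

```python
def bayes_decide(px_given_w1, px_given_w2, prior1, prior2):
    """Two-class Bayes decision: returns (decision, posteriors, P(error|x)).
    Decide class 1 iff p(x|w1)P(w1) > p(x|w2)P(w2)."""
    evidence = px_given_w1 * prior1 + px_given_w2 * prior2
    post1 = px_given_w1 * prior1 / evidence
    post2 = px_given_w2 * prior2 / evidence
    decision = 1 if post1 > post2 else 2
    return decision, (post1, post2), min(post1, post2)
```

For example, with priors P(ω1) = 2/3, P(ω2) = 1/3 and likelihoods p(x|ω1) = 0.6, p(x|ω2) = 0.2 (made-up values), the rule decides ω1.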
35 Decision Boundaries. Classification is the division of feature space into non-overlapping regions X1, ..., XR such that x in Xk is assigned to ωk. The boundaries between these regions are known as decision surfaces or decision boundaries.
36 Optimum decision boundaries. Criterion: minimize misclassification, i.e. maximize correct classification. Classify x in Xk as ωk if p(x|ωk)P(ωk) > p(x|ωj)P(ωj) for all j != k, i.e. if P(ωk|x) > P(ωj|x): maximum posterior probability. Here P(correct) = sum_k P(x in Xk, ωk) = sum_k integral over Xk of p(x|ωk)P(ωk) dx.
37 Discriminant functions. Discriminant functions determine classification by comparison of their values: classify x in Xk if gk(x) > gj(x) for all j != k. Optimum classification is based on the posterior probability, gk(x) = P(ωk|x). Any monotone function may be applied without changing the decision boundaries, e.g. gk(x) = ln P(ωk|x).
38 The Two-Category Case. Use discriminant functions g1 and g2, and assign x to ω1 if g1 > g2. Alternative: define a single discriminant function g(x) = g1(x) - g2(x), and decide ω1 if g(x) > 0, otherwise decide ω2. For the two-category case: g(x) = P(ω1|x) - P(ω2|x), or g(x) = ln[p(x|ω1)/p(x|ω2)] + ln[P(ω1)/P(ω2)].
39 Summary. Bayes approach: estimate the class-conditional probability density, combine it with the prior class probability, determine the posterior class probability, and derive the decision boundaries. Alternate approach (implemented by NNs): estimate the posterior probability directly, i.e. determine the decision boundaries directly.
40 DISCRIMINANT FUNCTIONS
41 Discriminant Functions. The classifier determines membership in a category based on the comparison of R discriminant functions g1(x), g2(x), ..., gR(x): x is within the region Xk if gk(x) has the largest value. Do not mix up n = the dimension of each input vector (dimension of feature space), P = the number of input vectors, and R = the number of classes.
42-46 Discriminant Functions. [Figure slides.]
47 Linear Machine and Minimum Distance Classification. Find the linear-form discriminant function for two-class classification when the class prototypes are known. Example 3.1: select the decision hyperplane that contains the midpoint of the line segment connecting the center points of the two classes.
48 Linear Machine and Minimum Distance Classification (dichotomizer). The dichotomizer's discriminant function is g(x) = (x1 - x2)^t x + (1/2)(||x2||^2 - ||x1||^2), where x1 and x2 are the two class centers.
49 Linear Machine and Minimum Distance Classification (multiclass classification). The linear-form discriminant functions for multiclass classification: there are up to R(R-1)/2 decision hyperplanes for R pairwise separable classes (i.e. classes next to or touching one another).
50 Linear Machine and Minimum Distance Classification (multiclass classification). Linear machine or minimum-distance classifier: assume the class prototypes xi are known for all classes. The Euclidean distance between an input pattern x and the center of class i is ||x - xi||; minimizing it is equivalent to maximizing gi(x) = xi^t x - (1/2) xi^t xi.
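A minimum-distance classifier follows directly from gi(x) = xi^t x - (1/2) xi^t xi. The sketch below, with made-up prototype points, picks the class whose linear discriminant is largest, which is exactly the nearest prototype:

```python
def make_linear_machine(prototypes):
    """Build g_i(x) = x_i . x - 0.5 * ||x_i||^2 for each class prototype and
    return a classifier that reports the index of the largest discriminant."""
    def g(p, x):
        return sum(pi * xi for pi, xi in zip(p, x)) - 0.5 * sum(pi * pi for pi in p)

    def classify(x):
        scores = [g(p, x) for p in prototypes]
        return scores.index(max(scores))

    return classify

# Hypothetical class centers (not the ones from Examples 3.1/3.2).
classify = make_linear_machine([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)])
```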
51 Linear Machine and Minimum Distance Classification (multiclass classification). [Figure.]
52 Linear Machine and Minimum Distance Classification. P1, P2, P3 are the centres of gravity of the prototype points; we need to design a minimum distance classifier. Using the formulas from the previous slide, we obtain the weights wi. Note: to find the decision surface S12 we need to compute g1 - g2.
53 Linear Machine and Minimum Distance Classification. If R linear discriminant functions exist for a set of patterns such that gi(x) > gj(x) for x in Class i, i = 1, 2, ..., R, j = 1, 2, ..., R, i != j, then the classes are linearly separable.
54 Linear Machine and Minimum Distance Classification. Example:
55 Linear Machine and Minimum Distance Classification. Example.
56 Linear Machine and Minimum Distance Classification. Examples 3.1 and 3.2 have shown that the coefficients (weights) of the linear discriminant functions can be determined if a priori information about the sets of patterns and their class membership is known. In the next section (the discrete perceptron) we will examine neural networks that derive their weights during the learning cycle.
57 Linear Machine and Minimum Distance Classification. An example of linearly non-separable patterns.
58 Linear Machine and Minimum Distance Classification. o = sgn(.) maps the input space (x1, x2) into the image space. [Figure: image space and input space.]
59 Linear Machine and Minimum Distance Classification. [Figure.] These inputs map to the same point in the image space.
60 The Discrete Perceptron
61 Discrete Perceptron Training Algorithm. So far, we have shown that the coefficients of linear discriminant functions, called weights, can be determined based on a priori information about sets of patterns and their class membership. In what follows, we will begin to examine neural network classifiers that derive their weights during the learning cycle. The sample pattern vectors x1, x2, ..., xp, called the training sequence, are presented to the machine along with the correct response.
62 Discrete Perceptron Training Algorithm: Geometrical Representations (Zurada, Chapter 3). [Figure: each decision hyperplane in the augmented weight space intersects the origin, point 0; 5 prototype patterns in this case: y1, ..., y5.] If the dimension of the augmented pattern vector is > 3, our powers of visualization are no longer of assistance; in this case, the only recourse is the analytical approach.
63 Discrete Perceptron Training Algorithm: Geometrical Representations. Devise an analytic approach based on the geometrical representations. E.g. the decision surface in weight space for the training pattern y1 (y1 in Class 1, see previous slide) is w^t y1 = 0. If a misclassified y is in Class 1: w' = w + cy; if it is in Class 2: w' = w - cy. The gradient gives the direction of steepest increase; c > 0, the correction increment, controls the size of the adjustment (it is two times the learning constant ρ introduced before), and the correction is in the negative gradient direction.
64 Discrete Perceptron Training Algorithm: Geometrical Representations. [Figure.]
65 Discrete Perceptron Training Algorithm: Geometrical Representations. Choosing c = |w^t y| / (y^t y) in w' = w ± cy moves the weights onto the decision plane; note that p = |w^t y| / ||y|| > 0 is the distance from the current weight vector to that plane. Note: c is then not constant, and depends on the current training pattern, as expressed by the equation above.
66 Discrete Perceptron Training Algorithm: Geometrical Representations. For the dynamic correction rule the initial weight should be different from 0: if w = 0, then c = 0, cy = 0, and w' = w + cy = 0, so no adjustments are possible. For the fixed correction rule (c = constant), the correction of weights is always the same fixed portion of the current training vector, and the weight can be initialised at any value. For the dynamic correction rule, c depends on the distance from the weight vector to the decision surface in the weight space; hence it depends on both the current weight and the current input pattern.
67 Discrete Perceptron Training Algorithm: Geometrical Representations. Dynamic correction rule: using the value of c from the previous slide as a reference, we devise an adjustment technique which depends on a parameter λ. λ = 2: symmetrical reflection w.r.t. the decision plane; λ = 0: no weight adjustment. Note: λ is the ratio of the distance between the old weight vector w and the new w', to the distance from w to the pattern hyperplane.
68 Discrete Perceptron Training Algorithm: Geometrical Representations. Example: the one-dimensional patterns x1 = 1, x2 = -0.5, x3 = 3, x4 = -2, with desired responses d1 = d3 = 1 (class 1) and d2 = d4 = -1 (class 2). The augmented input vectors are y1 = [1, 1]^t, y2 = [-0.5, 1]^t, y3 = [3, 1]^t, y4 = [-2, 1]^t. The decision lines w^t yi = 0, for i = 1, 2, 3, 4, are sketched in the augmented weight space as follows:
69 Discrete Perceptron Training Algorithm: Geometrical Representations. [Figure.]
70 Discrete Perceptron Training Algorithm: Geometrical Representations. For c = 1 and w1 = [-2.5, 1.75]^t, using w' = w ± cy, the weight training at each step can be summarized as w^(k+1) = w^k + (c/2)[d^k - sgn(w^(k)t y^k)] y^k. We obtain the following outputs and weight updates. Step 1: pattern y1 is input: o1 = sgn(w1^t y1) = -1 != d1 = 1, so w2 = w1 + y1 = [-1.5, 2.75]^t.
71 Discrete Perceptron Training Algorithm: Geometrical Representations. Step 2: pattern y2 is input; since o2 = sgn(w2^t y2) != d2, we set w3 = w2 - y2. Step 3: pattern y3 is input; since o3 = sgn(w3^t y3) != d3, we set w4 = w3 + y3.
72 Discrete Perceptron Training Algorithm: Geometrical Representations. Since we have no evidence of correct classification of the weight w4, the training set consisting of the ordered sequence of patterns needs to be recycled (the superscript denotes the training step number). Steps 4, 5: no misclassification, thus no weight adjustments. One can check that the adjustments in steps 6 through 10 proceed similarly, yielding w7 = [2.5, 1.75]^t and eventually a weight vector in the solution area.
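The whole fixed-correction procedure can be sketched as a loop that cycles through the training sequence until one error-free pass. The data below reconstruct the slide's example (values partly inferred from the garbled original, so treat them as illustrative):

```python
def sgn(v):
    return 1.0 if v >= 0 else -1.0

def train_discrete_perceptron(patterns, targets, w, c=1.0, max_cycles=100):
    """Fixed-correction rule on augmented patterns y:
    w <- w + (c/2) * (d - sgn(w . y)) * y, recycled until an error-free pass."""
    for _ in range(max_cycles):
        errors = 0
        for y, d in zip(patterns, targets):
            o = sgn(sum(wi * yi for wi, yi in zip(w, y)))
            if o != d:
                errors += 1
                w = [wi + 0.5 * c * (d - o) * yi for wi, yi in zip(w, y)]
        if errors == 0:
            break
    return w

ys = [(1.0, 1.0), (-0.5, 1.0), (3.0, 1.0), (-2.0, 1.0)]
ds = [1.0, -1.0, 1.0, -1.0]
w_final = train_discrete_perceptron(ys, ds, [-2.5, 1.75])
```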
73 The Continuous Perceptron
74 Continuous Perceptron Training Algorithm (Zurada, Chapter 3). Replace the TLU (Threshold Logic Unit) with the sigmoid activation function for two reasons: to gain finer control over the training procedure, and to provide the differential characteristics that enable computation of the gradient of the current error function E = (1/2)(d - o)^2. The factor 1/2 does not affect the location of the error minimum.
75 Continuous Perceptron Training Algorithm. The new weights are obtained by moving in the direction of the negative gradient along the multidimensional error surface. By definition of the steepest descent concept, each elementary move should be perpendicular to the current error contour.
76 Continuous Perceptron Training Algorithm. Define the error as the (halved) squared difference between the desired output and the actual output: E = (1/2)(d - o)^2, with o = f(net). Since net = w^t y, we have d(net)/d(wi) = yi, i = 1, 2, ..., n+1, which gives the update w' = w + η(d - o) f'(net) y: the training rule of the continuous perceptron, equivalent to the delta training rule.
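The delta rule for the continuous bipolar perceptron, with f(net) = 2/(1 + exp(-λ net)) - 1 and hence f'(net) = (λ/2)(1 - o^2), can be sketched as follows (training data reused from the discrete example, so regard the numbers as illustrative):

```python
import math

def train_delta(patterns, targets, w, eta=0.5, lam=1.0, epochs=300):
    """Continuous perceptron / delta rule:
    o = f(w . y), w <- w + eta * (d - o) * f'(net) * y."""
    for _ in range(epochs):
        for y, d in zip(patterns, targets):
            net = sum(wi * yi for wi, yi in zip(w, y))
            o = 2.0 / (1.0 + math.exp(-lam * net)) - 1.0
            fprime = 0.5 * lam * (1.0 - o * o)
            w = [wi + eta * (d - o) * fprime * yi for wi, yi in zip(w, y)]
    return w

ys = [(1.0, 1.0), (-0.5, 1.0), (3.0, 1.0), (-2.0, 1.0)]
ds = [1.0, -1.0, 1.0, -1.0]
w_cont = train_delta(ys, ds, [-2.5, 1.75])
```

Unlike the discrete rule, the updates never stop exactly; the outputs only approach ±1 asymptotically, which is why training runs for a fixed number of epochs here.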
77 Continuous Perceptron Training Algorithm. [Figure.]
78 Continuous Perceptron Training Algorithm. Same as the previous example of the discrete perceptron, but with a continuous activation function and using the delta rule; same training pattern set as in the discrete perceptron example.
79 Continuous Perceptron Training Algorithm. With the bipolar sigmoid f(net) = 2/(1 + exp(-λ net)) - 1, the error contributed by the k-th pattern is Ek = (1/2)[dk - f(w^t yk)]^2; expanding this for each training pattern in turn, and reducing terms, gives the per-pattern error surfaces shown on the previous slide.
80 Continuous Perceptron Training Algorithm. [Figure: error surface with its minimum.]
81 Multicategory SLP
82 Multi-category Single Layer Perceptron nets. Treat the last, fixed component of the input pattern vector as the neuron activation threshold: y_(n+1) = ±1 (it is irrelevant whether it is equal to +1 or -1).
83 Multi-category Single Layer Perceptron nets. R-category linear classifier using R discrete bipolar perceptrons. Goal: the i-th TLU response of +1 is indicative of class i, and all other TLUs respond with -1.
84 Multi-category Single Layer Perceptron nets. Example 3.5: the desired response for a class-1 pattern is [1, -1, -1]^t. Indecision regions: regions where no class membership of an input pattern can be uniquely determined based on the response of the classifier; patterns in the shaded areas are not assigned any reasonable classification. E.g. the point Q, for which the response o is indecisive; however, no patterns such as Q have been used for training in the example.
85 Multi-category Single Layer Perceptron nets. For c = 1 and initial weight vectors w1, w2, w3. Step 1: pattern y1 is input, and each TLU computes sgn(wi^t y1). Since the only incorrect response is provided by TLU3, its weights are adjusted (w3 <- w3 - y1) while the weights of TLU1 and TLU2 are unchanged.
86 Multi-category Single Layer Perceptron nets. Step 2: pattern y2 is input; again only the TLU giving an incorrect response (marked *) has its weights adjusted.
87 Multi-category Single Layer Perceptron nets. Step 3: pattern y3 is input. One can verify which TLU's weights are the only ones adjusted from now on; during the second cycle the responses sgn(wi^t y) are recomputed for each pattern until all three TLUs respond correctly.
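The R-TLU scheme, with only the incorrectly responding units adjusted as in Steps 1-3 above, can be sketched with hypothetical training points (not the ones from Example 3.5):

```python
def sgn(v):
    return 1.0 if v >= 0 else -1.0

def train_multicategory(patterns, labels, W, c=1.0, cycles=50):
    """R discrete bipolar TLUs: the TLU of the true class should answer +1,
    all others -1; only a TLU whose response is wrong gets adjusted."""
    for _ in range(cycles):
        for y, label in zip(patterns, labels):
            for i in range(len(W)):
                d = 1.0 if i == label else -1.0
                o = sgn(sum(wi * yi for wi, yi in zip(W[i], y)))
                if o != d:
                    W[i] = [wi + 0.5 * c * (d - o) * yi for wi, yi in zip(W[i], y)]
    return W

# Three hypothetical augmented patterns, one per class.
ys = [(0.0, 2.0, 1.0), (-2.0, -1.0, 1.0), (2.0, -1.0, 1.0)]
labels = [0, 1, 2]
W = train_multicategory(ys, labels, [[0.0] * 3 for _ in range(3)])
```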
88 Multi-category Single Layer Perceptron nets. R-category linear classifier using R continuous bipolar perceptrons.
89 Comparison between the Perceptron and the Bayes Classifier. The perceptron operates on the premise that the patterns to be classified are linearly separable (otherwise the training algorithm will oscillate), while the Bayes classifier can work on nonseparable patterns. The Bayes classifier minimizes the probability of misclassification, which is independent of the underlying distribution. The Bayes classifier is a linear classifier under the assumption of Gaussianity. The perceptron is non-parametric, while the Bayes classifier is parametric (its derivation is contingent on the assumption of the underlying distributions). The perceptron is adaptive and simple to implement; the Bayes classifier could be made adaptive too, but at the expense of increased storage and more complex computations.
90 APPENDIX A: Unconstrained Optimization Techniques
91 Unconstrained Optimization Techniques (Haykin, Chapter 3). The cost function E(w), continuously differentiable, is a measure of how to choose the weight vector w of an adaptive filtering algorithm so that it behaves in an optimum manner. We want to find an optimal solution w* that minimizes E(w): the gradient of E vanishes at w*. Local iterative descent: starting with an initial guess denoted by w(0), generate a sequence of weight vectors w(1), w(2), ..., such that the cost function E(w) is reduced at each iteration of the algorithm: E(w(n+1)) < E(w(n)). Methods: steepest descent, Newton's, Gauss-Newton's.
92 Method of Steepest Descent. Here the successive adjustments applied to w are in the direction of steepest descent, that is, in a direction opposite to the gradient: w(n+1) = w(n) - a g(n), where a is a small positive constant called the step size or learning-rate parameter, and g is the gradient of E(w). The method of steepest descent converges to the optimal solution w* slowly, and the learning-rate parameter a has a profound influence on its convergence behavior (overdamped, underdamped, or even unstable/divergent).
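A steepest-descent sketch on a simple quadratic bowl (an illustrative cost, not one from the slides) shows the update w(n+1) = w(n) - a g(n) in action:

```python
def steepest_descent(grad, w, a=0.1, n_iter=500):
    """Iterate w(n+1) = w(n) - a * g(n), moving opposite the gradient."""
    for _ in range(n_iter):
        g = grad(w)
        w = [wi - a * gi for wi, gi in zip(w, g)]
    return w

# E(w) = (w0 - 1)^2 + 2 * (w1 + 3)^2 has its minimum at (1, -3).
grad_E = lambda w: [2.0 * (w[0] - 1.0), 4.0 * (w[1] + 3.0)]
w_star = steepest_descent(grad_E, [5.0, 5.0])
```

Too large a step size makes the iteration diverge: with a = 1.1 the same quadratic oscillates with growing amplitude, illustrating the unstable regime mentioned above.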
93 Newton's Method. Using a second-order Taylor series expansion of the cost function around the point w(n): ΔE(w(n)) = E(w(n+1)) - E(w(n)) ≈ g^t(n) Δw(n) + (1/2) Δw^t(n) H(n) Δw(n), where Δw(n) = w(n+1) - w(n) and H is the Hessian matrix of E. We want the Δw* that minimizes ΔE(w(n)), so differentiating with respect to Δw: g(n) + H(n) Δw* = 0, and therefore Δw* = -H^(-1)(n) g(n).
94 Newton's Method. Finally, w(n+1) = w(n) - H^(-1)(n) g(n). Newton's method converges quickly asymptotically and does not exhibit the zigzagging behavior; however, the Hessian H(n) has to be a positive definite matrix for all n.
95 Gauss-Newton Method. The Gauss-Newton method is applicable to a cost function expressed as a sum of error squares, E(w) = (1/2) sum_i e^2(i). Because the error signal e(i) is a function of w, we linearize the dependence of e(i) on w by writing e'(i, w) = e(i) + [de(i)/dw]^t (w - w(n)). Equivalently, in matrix notation, e'(n, w) = e(n) + J(n)(w - w(n)).
96 Gauss-Newton Method. Here J(n) is the n-by-m Jacobian matrix of e(n), i.e. the matrix of partial derivatives of the errors with respect to the weights. We want the updated weight vector w(n+1) = arg min over w of (1/2) ||e'(n, w)||^2. A simple algebraic calculation gives (1/2)||e'(n, w)||^2 = (1/2)||e(n)||^2 + e^t(n) J(n)(w - w(n)) + (1/2)(w - w(n))^t J^t(n) J(n)(w - w(n)). Differentiating this expression with respect to w and setting the result to 0, we obtain J^t(n) e(n) + J^t(n) J(n)(w - w(n)) = 0.
97 Gauss-Newton Method. Thus we get w(n+1) = w(n) - (J^t(n) J(n))^(-1) J^t(n) e(n). To guard against the possibility that the matrix product J^t(n) J(n) is singular, the customary practice is to add a diagonal loading term: w(n+1) = w(n) - (J^t(n) J(n) + δI)^(-1) J^t(n) e(n), where δ is a small positive constant. The effect of this modification is progressively reduced as the number of iterations, n, is increased.
98 Linear Least-Squares Filter. The single neuron around which it is built is linear, and the cost function consists of the sum of error squares. Using y(i) = x^t(i) w and e(i) = d(i) - y(i), the error vector is e(n) = d(n) - X(n) w(n); differentiating it with respect to w(n) gives the Jacobian. Substituting into the Gauss-Newton update yields w(n+1) = (X^t(n) X(n))^(-1) X^t(n) d(n) = X^+(n) d(n), where X^+ denotes the pseudoinverse of X.
99 LMS Algorithm. Based on the use of instantaneous values for the cost function: E(w) = (1/2) e^2(n). Differentiating with respect to w: dE(w)/dw = e(n) de(n)/dw. The error signal in the LMS algorithm is e(n) = d(n) - x^t(n) w(n); hence de(n)/dw(n) = -x(n), so dE(w)/dw(n) = -x(n) e(n).
100 LMS Algorithm. Using -x(n) e(n) as an estimate of the gradient vector, and substituting it into the steepest descent update, the LMS algorithm follows: w^(n+1) = w^(n) + η x(n) e(n), where η is the learning-rate parameter. The inverse of η is a measure of the memory of the LMS algorithm: when η is small, the adaptive process progresses slowly, more of the past data are remembered, and a more accurate filtering action results.
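An LMS sketch (a hypothetical two-tap example with a noise-free target d(n) = 2 x0 - x1) shows the update w(n+1) = w(n) + η x(n) e(n):

```python
def lms(xs, ds, m, eta=0.05):
    """LMS: e(n) = d(n) - x(n)^T w(n); w(n+1) = w(n) + eta * x(n) * e(n)."""
    w = [0.0] * m
    for x, d in zip(xs, ds):
        e = d - sum(wi * xi for wi, xi in zip(w, x))
        w = [wi + eta * xi * e for wi, xi in zip(w, x)]
    return w

# Alternating one-hot inputs; the desired response comes from w_true = (2, -1).
xs = [(1.0, 0.0), (0.0, 1.0)] * 500
ds = [2.0 * x[0] - 1.0 * x[1] for x in xs]
w_hat = lms(xs, ds, 2)
```

Because the example is noise-free, the estimate converges to the true taps; with noisy data it would instead fluctuate about them, as described on the next slide.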
101 LMS Characteristics. The LMS algorithm produces an estimate of the weight vector, sacrificing a distinctive feature of steepest descent: the steepest descent algorithm follows a well-defined trajectory, while the LMS estimate follows a random trajectory. As the number of iterations goes to infinity, the estimate performs a random walk about the optimal solution. But importantly, the LMS algorithm does not require knowledge of the statistics of the environment.
102 Convergence Considerations. Two distinct quantities, η and x(n), determine the convergence: the user supplies η, and the selection of x(n) is important for the LMS algorithm to converge. Convergence of the mean: E[w(n)] tends to the optimal solution as n tends to infinity; this alone is of limited practical value. Convergence in the mean square: E[e^2(n)] tends to a constant as n tends to infinity. The convergence condition for the LMS algorithm in the mean square is 0 < η < 2 / (sum of mean-square values of the sensor inputs).
103 APPENDIX B: Perceptron Convergence Proof
104 Perceptron Convergence Proof (Haykin, Chapter 3). Consider the following perceptron: v = sum over i = 0..m of wi xi = w^t x, with the goal w^t x > 0 for every input vector x belonging to class C1, and w^t x <= 0 for every input vector x belonging to class C2.
105 Perceptron Convergence Proof. The algorithm for the weight adjustment of the perceptron: if x(n) is correctly classified, no adjustment is made, i.e. w(n+1) = w(n) if w^t(n) x(n) > 0 and x(n) belongs to class C1, or if w^t(n) x(n) <= 0 and x(n) belongs to class C2. Otherwise: w(n+1) = w(n) - η x(n) if w^t(n) x(n) > 0 and x(n) belongs to class C2; w(n+1) = w(n) + η x(n) if w^t(n) x(n) <= 0 and x(n) belongs to class C1. The learning-rate parameter η controls the adjustment applied to the weight vector.
106 Perceptron Convergence Proof. Take η = 1 and w(0) = 0. Suppose the perceptron incorrectly classifies the vectors x(1), x(2), ..., such that w^t(n) x(n) <= 0 while x(n) belongs to C1, so that w(n+1) = w(n) + x(n) for x(n) belonging to C1. Since w(0) = 0, we iteratively find w(n+1) = x(1) + x(2) + ... + x(n)   (B1). Since the classes C1 and C2 are assumed to be linearly separable, there exists a solution w0 for which w0^t x > 0 for all vectors x(1), ..., x(n) belonging to the subset H1 (the subset of training vectors that belong to class C1).
107 Perceptron Convergence Proof. For a fixed solution w0, we may then define a positive number α as α = min over x(n) in H1 of w0^t x(n). Hence equation (B1) implies w0^t w(n+1) = w0^t x(1) + ... + w0^t x(n)   (B2). Since each term is greater than or equal to α, we have w0^t w(n+1) >= n α. Now we use the Cauchy-Schwarz inequality: ||w0||^2 ||w(n+1)||^2 >= [w0^t w(n+1)]^2.
108 Perceptron Convergence Proof. This implies that ||w(n+1)||^2 >= n^2 α^2 / ||w0||^2   (B3). Now let us follow another development route (notice the index k): w(k+1) = w(k) + x(k) for k = 1, ..., n and x(k) in H1. Taking the squared Euclidean norm of both sides gives ||w(k+1)||^2 = ||w(k)||^2 + ||x(k)||^2 + 2 w^t(k) x(k). But under the assumption that the perceptron incorrectly classifies an input vector x(k) belonging to the subset H1, we have w^t(k) x(k) <= 0 and hence ||w(k+1)||^2 <= ||w(k)||^2 + ||x(k)||^2.
109 Perceptron Convergence Proof. Or, equivalently, ||w(k+1)||^2 - ||w(k)||^2 <= ||x(k)||^2 for k = 1, ..., n. Adding these inequalities for k = 1, ..., n, and invoking the initial condition w(0) = 0, we get ||w(n+1)||^2 <= sum over k = 1..n of ||x(k)||^2 <= n β   (B4), where β is a positive number defined by β = max over x(k) in H1 of ||x(k)||^2. Eq. (B4) states that the squared Euclidean norm of w(n+1) grows at most linearly with the number of iterations n.
110 Perceptron Convergence Proof. The second result (B4) is clearly in conflict with Eq. (B3) for sufficiently large n. Indeed, n cannot be larger than some value n_max for which Eqs. (B3) and (B4) are both satisfied with the equality sign; that is, n_max is the solution of n_max^2 α^2 / ||w0||^2 = n_max β. Solving for n_max given a solution vector w0, we find n_max = β ||w0||^2 / α^2. We have thus proved that for η = 1 for all n, and for w(0) = 0, given that a solution vector w0 exists, the rule for adapting the synaptic weights of the perceptron must terminate after at most n_max iterations.
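The bound can be checked empirically. In the sketch below (with made-up training vectors), class-C2 vectors would be multiplied by -1 so that every z must satisfy w^t z > 0; the η = 1, w(0) = 0 rule then makes at most n_max = β ||w0||^2 / α^2 mistakes:

```python
def run_perceptron(zs, max_passes=100):
    """eta = 1, w(0) = 0; count mistakes until a pass with w . z > 0 for all z."""
    w = [0.0] * len(zs[0])
    mistakes = 0
    for _ in range(max_passes):
        clean = True
        for z in zs:
            if sum(wi * zi for wi, zi in zip(w, z)) <= 0:
                w = [wi + zi for wi, zi in zip(w, z)]
                mistakes += 1
                clean = False
        if clean:
            break
    return w, mistakes

zs = [(1.0, 0.2), (0.8, -0.5), (2.0, 1.0)]   # all satisfy (1, 0) . z > 0
w, mistakes = run_perceptron(zs)

w0 = (1.0, 0.0)                              # a known separating solution
alpha = min(sum(a * b for a, b in zip(z, w0)) for z in zs)
beta = max(sum(a * a for a in z) for z in zs)
n_max = beta * sum(a * a for a in w0) / alpha ** 2
```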
111 MORE READING
112 Suggested Reading. 1. S. Haykin, Neural Networks, Prentice-Hall, 1999, chapter 3. 2. L. Fausett, Fundamentals of Neural Networks, Prentice-Hall, 1994, chapter 2. 3. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, Wiley, 2001, appendix A4, chapter 2, and chapter 5. 4. J. M. Zurada, Introduction to Artificial Neural Systems, West Publishing Company, 1992, chapter 3.
113 References. These lecture notes were based on the references on the previous slide, and on the following: 1. Berlin Chen, lecture notes, National Taiwan Normal University, Taipei, Taiwan, ROC. 2. Ehud Rivlin, IIT. 3. Jin Hyung Kim, KAIST Computer Science Dept., CS679 Neural Network lecture notes. 4. Dr John A. Bullinaria, course material, Introduction to Neural Networks.
More informationPrinciple Of Superposition
ecture 5: PREIMINRY CONCEP O RUCUR NYI Priciple Of uperpositio Mathematically, the priciple of superpositio is stated as ( a ) G( a ) G( ) G a a or for a liear structural system, the respose at a give
More informationConvergence of random variables. (telegram style notes) P.J.C. Spreij
Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space
More information62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +
62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of
More informationOutline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression
REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques
More informationMathematical Foundations -1- Sets and Sequences. Sets and Sequences
Mathematical Foudatios -1- Sets ad Sequeces Sets ad Sequeces Methods of proof 2 Sets ad vectors 13 Plaes ad hyperplaes 18 Liearly idepedet vectors, vector spaces 2 Covex combiatios of vectors 21 eighborhoods,
More informationChapter 7: The z-transform. Chih-Wei Liu
Chapter 7: The -Trasform Chih-Wei Liu Outlie Itroductio The -Trasform Properties of the Regio of Covergece Properties of the -Trasform Iversio of the -Trasform The Trasfer Fuctio Causality ad Stability
More informationOptimization Methods MIT 2.098/6.255/ Final exam
Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short
More information3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,
3. Z Trasform Referece: Etire Chapter 3 of text. Recall that the Fourier trasform (FT) of a DT sigal x [ ] is ω ( ) [ ] X e = j jω k = xe I order for the FT to exist i the fiite magitude sese, S = x [
More informationSeptember 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1
September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright
More informationClustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.
Clusterig CM226: Machie Learig for Bioiformatics. Fall 216 Sriram Sakararama Ackowledgmets: Fei Sha, Ameet Talwalkar Clusterig 1 / 42 Admiistratio HW 1 due o Moday. Email/post o CCLE if you have questios.
More informationFMA901F: Machine Learning Lecture 4: Linear Models for Classification. Cristian Sminchisescu
FMA90F: Machie Learig Lecture 4: Liear Models for Classificatio Cristia Smichisescu Liear Classificatio Classificatio is itrisically o liear because of the traiig costraits that place o idetical iputs
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio
More information6.003 Homework #3 Solutions
6.00 Homework # Solutios Problems. Complex umbers a. Evaluate the real ad imagiary parts of j j. π/ Real part = Imagiary part = 0 e Euler s formula says that j = e jπ/, so jπ/ j π/ j j = e = e. Thus the
More informationWe are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n
Review of Power Series, Power Series Solutios A power series i x - a is a ifiite series of the form c (x a) =c +c (x a)+(x a) +... We also call this a power series cetered at a. Ex. (x+) is cetered at
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More information6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.
6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio
More informationProblem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t =
Mathematics Summer Wilso Fial Exam August 8, ANSWERS Problem 1 (a) Fid the solutio to y +x y = e x x that satisfies y() = 5 : This is already i the form we used for a first order liear differetial equatio,
More informationPAPER : IIT-JAM 2010
MATHEMATICS-MA (CODE A) Q.-Q.5: Oly oe optio is correct for each questio. Each questio carries (+6) marks for correct aswer ad ( ) marks for icorrect aswer.. Which of the followig coditios does NOT esure
More informationDifferentiable Convex Functions
Differetiable Covex Fuctios The followig picture motivates Theorem 11. f ( x) f ( x) f '( x)( x x) ˆx x 1 Theorem 11 : Let f : R R be differetiable. The, f is covex o the covex set C R if, ad oly if for
More informationThe picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled
1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how
More informationCS537. Numerical Analysis and Computing
CS57 Numerical Aalysis ad Computig Lecture Locatig Roots o Equatios Proessor Ju Zhag Departmet o Computer Sciece Uiversity o Ketucky Leigto KY 456-6 Jauary 9 9 What is the Root May physical system ca be
More informationMa 530 Introduction to Power Series
Ma 530 Itroductio to Power Series Please ote that there is material o power series at Visual Calculus. Some of this material was used as part of the presetatio of the topics that follow. What is a Power
More informationMachine Learning for Data Science (CS 4786)
Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm
More informationREGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationAn Introduction to Randomized Algorithms
A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis
More informationAxis Aligned Ellipsoid
Machie Learig for Data Sciece CS 4786) Lecture 6,7 & 8: Ellipsoidal Clusterig, Gaussia Mixture Models ad Geeral Mixture Models The text i black outlies high level ideas. The text i blue provides simple
More information6.867 Machine learning, lecture 7 (Jaakkola) 1
6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit
More informationChapter 2 The Solution of Numerical Algebraic and Transcendental Equations
Chapter The Solutio of Numerical Algebraic ad Trascedetal Equatios Itroductio I this chapter we shall discuss some umerical methods for solvig algebraic ad trascedetal equatios. The equatio f( is said
More informationMath 257: Finite difference methods
Math 257: Fiite differece methods 1 Fiite Differeces Remember the defiitio of a derivative f f(x + ) f(x) (x) = lim 0 Also recall Taylor s formula: (1) f(x + ) = f(x) + f (x) + 2 f (x) + 3 f (3) (x) +...
More informationComplex Analysis Spring 2001 Homework I Solution
Complex Aalysis Sprig 2001 Homework I Solutio 1. Coway, Chapter 1, sectio 3, problem 3. Describe the set of poits satisfyig the equatio z a z + a = 2c, where c > 0 ad a R. To begi, we see from the triagle
More informationMultilayer perceptrons
Multilayer perceptros If traiig set is ot liearly separable, a etwork of McCulloch-Pitts uits ca give a solutio If o loop exists i etwork, called a feedforward etwork (else, recurret etwork) A two-layer
More informationGeneralized Semi- Markov Processes (GSMP)
Geeralized Semi- Markov Processes (GSMP) Summary Some Defiitios Markov ad Semi-Markov Processes The Poisso Process Properties of the Poisso Process Iterarrival times Memoryless property ad the residual
More informationThe axial dispersion model for tubular reactors at steady state can be described by the following equations: dc dz R n cn = 0 (1) (2) 1 d 2 c.
5.4 Applicatio of Perturbatio Methods to the Dispersio Model for Tubular Reactors The axial dispersio model for tubular reactors at steady state ca be described by the followig equatios: d c Pe dz z =
More informationCS321. Numerical Analysis and Computing
CS Numerical Aalysis ad Computig Lecture Locatig Roots o Equatios Proessor Ju Zhag Departmet o Computer Sciece Uiversity o Ketucky Leigto KY 456-6 September 8 5 What is the Root May physical system ca
More informationFall 2013 MTH431/531 Real analysis Section Notes
Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationSubject: Differential Equations & Mathematical Modeling -III. Lesson: Power series solutions of Differential Equations. about ordinary points
Power series solutio of Differetial equatios about ordiary poits Subject: Differetial Equatios & Mathematical Modelig -III Lesso: Power series solutios of Differetial Equatios about ordiary poits Lesso
More informationNotes on iteration and Newton s method. Iteration
Notes o iteratio ad Newto s method Iteratio Iteratio meas doig somethig over ad over. I our cotet, a iteratio is a sequece of umbers, vectors, fuctios, etc. geerated by a iteratio rule of the type 1 f
More informationNaïve Bayes. Naïve Bayes
Statistical Data Miig ad Machie Learig Hilary Term 206 Dio Sejdiovic Departmet of Statistics Oxford Slides ad other materials available at: http://www.stats.ox.ac.uk/~sejdiov/sdmml : aother plug-i classifier
More informationA sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as
More informationPattern Classification
Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher
More informationSequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet
More information1 Review of Probability & Statistics
1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5
More information1 Duality revisited. AM 221: Advanced Optimization Spring 2016
AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R
More informationLesson 10: Limits and Continuity
www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals
More informationSection 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations
Differece Equatios to Differetial Equatios Sectio. Calculus: Areas Ad Tagets The study of calculus begis with questios about chage. What happes to the velocity of a swigig pedulum as its positio chages?
More informationChapter 7 z-transform
Chapter 7 -Trasform Itroductio Trasform Uilateral Trasform Properties Uilateral Trasform Iversio of Uilateral Trasform Determiig the Frequecy Respose from Poles ad Zeros Itroductio Role i Discrete-Time
More informationMAT1026 Calculus II Basic Convergence Tests for Series
MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real
More informationSupport vector machine revisited
6.867 Machie learig, lecture 8 (Jaakkola) 1 Lecture topics: Support vector machie ad kerels Kerel optimizatio, selectio Support vector machie revisited Our task here is to first tur the support vector
More informationMost text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t
Itroductio to Differetial Equatios Defiitios ad Termiolog Differetial Equatio: A equatio cotaiig the derivatives of oe or more depedet variables, with respect to oe or more idepedet variables, is said
More informationVector Quantization: a Limiting Case of EM
. Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z
More information10.2 Infinite Series Contemporary Calculus 1
10. Ifiite Series Cotemporary Calculus 1 10. INFINITE SERIES Our goal i this sectio is to add together the umbers i a sequece. Sice it would take a very log time to add together the ifiite umber of umbers,
More informationΩ ). Then the following inequality takes place:
Lecture 8 Lemma 5. Let f : R R be a cotiuously differetiable covex fuctio. Choose a costat δ > ad cosider the subset Ωδ = { R f δ } R. Let Ωδ ad assume that f < δ, i.e., is ot o the boudary of f = δ, i.e.,
More informationCourse Outline. Designing Control Systems. Proportional Controller. Amme 3500 : System Dynamics and Control. Root Locus. Dr. Stefan B.
Amme 3500 : System Dyamics ad Cotrol Root Locus Course Outlie Week Date Cotet Assigmet Notes Mar Itroductio 8 Mar Frequecy Domai Modellig 3 5 Mar Trasiet Performace ad the s-plae 4 Mar Block Diagrams Assig
More informationINF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification
INF 4300 90 Itroductio to classifictio Ae Solberg ae@ifiuioo Based o Chapter -6 i Duda ad Hart: atter Classificatio 90 INF 4300 Madator proect Mai task: classificatio You must implemet a classificatio
More informationLecture 8: Solving the Heat, Laplace and Wave equations using finite difference methods
Itroductory lecture otes o Partial Differetial Equatios - c Athoy Peirce. Not to be copied, used, or revised without explicit writte permissio from the copyright ower. 1 Lecture 8: Solvig the Heat, Laplace
More informationCALCULUS BASIC SUMMER REVIEW
CALCULUS BASIC SUMMER REVIEW NAME rise y y y Slope of a o vertical lie: m ru Poit Slope Equatio: y y m( ) The slope is m ad a poit o your lie is, ). ( y Slope-Itercept Equatio: y m b slope= m y-itercept=
More information6.883: Online Methods in Machine Learning Alexander Rakhlin
6.883: Olie Methods i Machie Learig Alexader Rakhli LECURE 4 his lecture is partly based o chapters 4-5 i [SSBD4]. Let us o give a variat of SGD for strogly covex fuctios. Algorithm SGD for strogly covex
More informationAnalytic Continuation
Aalytic Cotiuatio The stadard example of this is give by Example Let h (z) = 1 + z + z 2 + z 3 +... kow to coverge oly for z < 1. I fact h (z) = 1/ (1 z) for such z. Yet H (z) = 1/ (1 z) is defied for
More informationw (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ.
2 5. Weighted umber of late jobs 5.1. Release dates ad due dates: maximimizig the weight of o-time jobs Oce we add release dates, miimizig the umber of late jobs becomes a sigificatly harder problem. For
More informationSubject: Differential Equations & Mathematical Modeling-III
Power Series Solutios of Differetial Equatios about Sigular poits Subject: Differetial Equatios & Mathematical Modelig-III Lesso: Power series solutios of differetial equatios about Sigular poits Lesso
More informationMath 113 Exam 4 Practice
Math Exam 4 Practice Exam 4 will cover.-.. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for
More informationsubject to A 1 x + A 2 y b x j 0, j = 1,,n 1 y j = 0 or 1, j = 1,,n 2
Additioal Brach ad Boud Algorithms 0-1 Mixed-Iteger Liear Programmig The brach ad boud algorithm described i the previous sectios ca be used to solve virtually all optimizatio problems cotaiig iteger variables,
More informationAssignment 1 : Real Numbers, Sequences. for n 1. Show that (x n ) converges. Further, by observing that x n+2 + x n+1
Assigmet : Real Numbers, Sequeces. Let A be a o-empty subset of R ad α R. Show that α = supa if ad oly if α is ot a upper boud of A but α + is a upper boud of A for every N. 2. Let y (, ) ad x (, ). Evaluate
More informationU8L1: Sec Equations of Lines in R 2
MCVU U8L: Sec. 8.9. Equatios of Lies i R Review of Equatios of a Straight Lie (-D) Cosider the lie passig through A (-,) with slope, as show i the diagram below. I poit slope form, the equatio of the lie
More informationDefinitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.
Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,
More informationChapter 7. Support Vector Machine
Chapter 7 Support Vector Machie able of Cotet Margi ad support vectors SVM formulatio Slack variables ad hige loss SVM for multiple class SVM ith Kerels Relevace Vector Machie Support Vector Machie (SVM)
More informationMa 530 Infinite Series I
Ma 50 Ifiite Series I Please ote that i additio to the material below this lecture icorporated material from the Visual Calculus web site. The material o sequeces is at Visual Sequeces. (To use this li
More informationPattern Classification
Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher
More informationPC5215 Numerical Recipes with Applications - Review Problems
PC55 Numerical Recipes with Applicatios - Review Problems Give the IEEE 754 sigle precisio bit patter (biary or he format) of the followig umbers: 0 0 05 00 0 00 Note that it has 8 bits for the epoet,
More information6.867 Machine learning
6.867 Machie learig Mid-term exam October, ( poits) Your ame ad MIT ID: Problem We are iterested here i a particular -dimesioal liear regressio problem. The dataset correspodig to this problem has examples
More information1 Review and Overview
CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we
More information