CS407 Neural Computation


1 CS407 Neural Computation Lecture 4: Single Layer Perceptron (SLP) Classifiers Lecturer: A/Prof. M. Bennamoun

2 Outline What is an SLP and what is classification? Limitation of a single perceptron. Foundations of classification and Bayes decision making theory. Discriminant functions, linear machines and minimum distance classification. Training and classification using the discrete perceptron. Single-layer continuous perceptron networks for linearly separable classifications. Appendix A: Unconstrained optimization techniques. Appendix B: Perceptron convergence proof. Suggested reading and references.

3 What is a perceptron and what is a Single Layer Perceptron (SLP)?

4 Perceptron The simplest form of a neural network: it consists of a single neuron with adjustable synaptic weights and bias, and performs pattern classification with only two classes. Perceptron convergence theorem: if the pattern vectors are drawn from two linearly separable classes, then during training the perceptron algorithm converges and positions the decision surface, in the form of a hyperplane, between the two classes by adjusting the synaptic weights.

5 What is a perceptron? Input signals x1, ..., xm are weighted by the synaptic weights wk1, ..., wkm and combined, together with the bias bk, at the summing junction: vk = Σj wkj xj + bk. The activation function φ(·) then produces the output yk = φ(vk). Discrete perceptron: φ = sign. Continuous perceptron: φ = an s-shaped (sigmoid) function.

6 Activation Function of a perceptron The signum function: sign(vi) = +1 for vi > 0 and -1 for vi < 0. Discrete perceptron: φ = sign. Continuous perceptron: φ(v) = an s-shaped (sigmoid) function.

7 SLP Architecture Single layer perceptron: an input layer and an output layer.

8 Where are we heading? Different non-linearly separable problems. Structure vs. the types of decision regions it can form (illustrated on the exclusive-OR problem, on classes with meshed regions, and on the most general region shapes): Single-layer: half plane bounded by a hyperplane. Two-layer: convex open or closed regions. Three-layer: arbitrary regions (complexity limited by the number of nodes).

9 Review from last lectures:

10 Implementing Logic Gates with Perceptrons We can use the perceptron to implement the basic logic gates AND, OR and NOT. All we need to do is find the appropriate connection weights and neuron thresholds to produce the right outputs for each set of inputs. We saw how we can construct simple networks that perform NOT, AND, and OR. It is then a well known result from logic that we can construct any logical function from these three operations. The resulting networks, however, will usually have a much more complex architecture than a simple perceptron. We generally want to avoid decomposing complex problems into simple logic gates, by finding the weights and thresholds that work directly in a perceptron architecture.

11 Implementation of Logical NOT, AND, and OR In each case we have inputs in1, in2 and outputs out, and need to determine the weights and thresholds. It is easy to find solutions by inspection:

12 The Need to Find Weights Analytically Constructing simple networks by hand is one thing. But what about harder problems? How long do we keep looking for a solution? We need to be able to calculate appropriate parameters rather than looking for solutions by trial and error. Each training pattern produces a linear inequality for the output in terms of the inputs and the network parameters. These can be used to compute the weights and thresholds.

13 Finding Weights Analytically for the AND Network We have two weights w1 and w2 and the threshold θ, and for each training pattern we need to satisfy out = 1 if w1 in1 + w2 in2 ≥ θ, else 0. So the training data lead to four inequalities: for (in1, in2) = (0, 0), (0, 1), (1, 0), (1, 1) with outputs 0, 0, 0, 1, we need θ > 0, w2 < θ, w1 < θ, and w1 + w2 ≥ θ. It is easy to see that there are an infinite number of solutions. Similarly, there are an infinite number of solutions for the NOT and OR networks.
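One of those infinitely many solutions can be checked directly; the choice w1 = w2 = 1 with θ = 1.5 is just one convenient pick:

```python
def perceptron(in1, in2, w1, w2, theta):
    """Threshold unit: fire (output 1) when w1*in1 + w2*in2 reaches theta."""
    return 1 if w1 * in1 + w2 * in2 >= theta else 0

# AND with w1 = w2 = 1, theta = 1.5 (one of infinitely many solutions):
for in1, in2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(in1, in2, "->", perceptron(in1, in2, 1.0, 1.0, 1.5))   # only (1, 1) fires
```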

14 Limitations of Simple Perceptrons We can follow the same procedure for the XOR network, which requires θ > 0, w2 ≥ θ, w1 ≥ θ, and w1 + w2 < θ. Clearly the second and third inequalities are incompatible with the fourth, so there is in fact no solution. We need more complex networks, e.g. ones that combine together many simple networks, or use different activation/thresholding/transfer functions. It then becomes much more difficult to determine all the weights and thresholds by hand. These weights are instead adapted using learning rules. Hence the need to consider learning rules (see previous lecture), and more complex architectures.

15 E.g. Decision Surface of a Perceptron [Left: + and - points separated by a line: linearly separable. Right: e.g. XOR: non-linearly separable.] The perceptron is able to represent some useful functions, but functions that are not linearly separable (e.g. XOR) are not representable.

16 What is classification?

17 Classification? Pattern classification/recognition: assign the input data (a physical object, event, or phenomenon) to one of the pre-specified classes (categories). [Block diagram of the recognition and classification system.]

18 Classification: an example (Duda & Hart, Chapter 1) Automate the process of sorting incoming fish on a conveyor belt according to species (salmon or sea bass): set up a camera, take some sample images, and note the physical differences between the two types of fish: length, lightness, width, number and shape of fins, and position of the mouth.

19 Classification: an example

20 Classification: an example Cost of misclassification: depends on the application. Is it better to misclassify salmon as bass or vice versa? Put salmon in a can of bass: lose profit. Put bass in a can of salmon: lose the customer. There is a cost associated with our decision; make a decision so as to minimize a given cost. Feature extraction: problem and domain dependent, and requires knowledge of the domain. A good feature extractor would make the job of the classifier trivial.

21 Bayesian decision theory

22 Bayesian Decision Theory (Duda & Hart, Chapter 2) Bayesian decision theory is a fundamental statistical approach to the problem of pattern classification: decision making when all the probabilistic information is known. For given probabilities the decision is optimal, and when new information is added, it is assimilated in optimal fashion for improvement of decisions.

23 Bayesian Decision Theory Fish example: each fish is in one of two states: sea bass or salmon. Let ω denote the state of nature: ω = ω1 for sea bass, ω = ω2 for salmon.

24 Bayesian Decision Theory The state of nature is unpredictable: ω is a variable that must be described probabilistically. If the catch produced as much salmon as sea bass, the next fish is equally likely to be sea bass or salmon. Define P(ω1): the a priori probability that the next fish is sea bass, and P(ω2): the a priori probability that the next fish is salmon.

25 Bayesian Decision Theory If other types of fish are irrelevant: P(ω1) + P(ω2) = 1. Prior probabilities reflect our prior knowledge (e.g. time of year, fishing area, ...). Simple decision rule: make a decision without seeing the fish: decide ω1 if P(ω1) > P(ω2), and ω2 otherwise. This is OK if deciding for one fish, but if there are several fish, all are assigned to the same class.

26 Bayesian Decision Theory... In general, we will have some features and more information. Feature: a lightness measurement x. Different fish yield different lightness readings: x is a random variable.

27 Bayesian Decision Theory Define p(x|ω1), the class-conditional probability density: the probability density function for x given that the state of nature is ω1. The difference between p(x|ω1) and p(x|ω2) describes the difference in lightness between sea bass and salmon.

28 Class-conditional probability density p(x|ω) Hypothetical class-conditional probability density functions, normalized so that the area under each curve is 1.0.

29 Bayesian Decision Theory... Suppose that we know the prior probabilities P(ω1) and P(ω2), and the conditional densities p(x|ω1) and p(x|ω2). We measure the lightness of a fish: x. What is the category of the fish, i.e. what is P(ωj|x)?

30 Bayes Formula Given the prior probabilities P(ωj), the conditional probabilities p(x|ωj), and the measurement of a particular item (feature value x), Bayes formula gives: P(ωj|x) = p(x|ωj) P(ωj) / p(x), i.e. posterior = (likelihood × prior) / evidence, where the evidence is p(x) = Σi p(x|ωi) P(ωi), so that Σj P(ωj|x) = 1.

31 Bayes' formula... p(x|ωj) is called the likelihood of ωj with respect to x: the ωj category for which p(x|ωj) is large is more "likely" to be the true category. p(x) is the evidence: how frequently we will measure a pattern with feature value x; it is a scale factor that guarantees that the posterior probabilities sum to 1.

32 Posterior Probability Posterior probabilities for the particular priors P(ω1) = 2/3 and P(ω2) = 1/3. At every x the posteriors sum to 1.
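Bayes formula is a one-line computation. A minimal sketch, using hypothetical likelihood values at a single feature value x together with priors 2/3 and 1/3:

```python
def posterior(likelihoods, priors):
    """Bayes formula: P(w_j|x) = p(x|w_j) P(w_j) / p(x), where the
    evidence p(x) = sum_i p(x|w_i) P(w_i) makes the posteriors sum to 1."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Hypothetical class-conditional densities evaluated at one lightness x:
post = posterior([0.5, 1.5], [2 / 3, 1 / 3])
print([round(p, 3) for p in post])   # [0.4, 0.6]
```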

33 Error If we decide ω2, then P(error|x) = P(ω1|x); if we decide ω1, then P(error|x) = P(ω2|x). For a given x, we can minimize the probability of error by deciding ω1 if P(ω1|x) > P(ω2|x), and ω2 otherwise.

34 Bayes' Decision Rule Minimizes the probability of error: decide ω1 if P(ω1|x) > P(ω2|x), and ω2 otherwise; equivalently, decide ω1 if p(x|ω1)P(ω1) > p(x|ω2)P(ω2), and ω2 otherwise. Then P(error|x) = min[P(ω1|x), P(ω2|x)]. In likelihood-ratio form: decide ω1 if the likelihood ratio p(x|ω1)/p(x|ω2) exceeds the threshold P(ω2)/P(ω1), and ω2 otherwise.

35 Decision Boundaries Classification can be viewed as the division of the feature space into non-overlapping regions X1, ..., XR such that x in Xk is assigned to ωk. The boundaries between these regions are known as decision surfaces or decision boundaries.

36 Optimum decision boundaries Criterion: minimize misclassification, i.e. maximize correct classification. Classify x in Xk if p(x|ωk)P(ωk) > p(x|ωj)P(ωj) for all j ≠ k, i.e. if P(ωk|x) > P(ωj|x): the maximum-posterior-probability rule. Then P(correct) = Σk P(x in Xk, ωk) = Σk ∫_{Xk} p(x|ωk)P(ωk) dx.

37 Discriminant functions Discriminant functions determine classification by comparison of their values: classify x in Xk if gk(x) > gj(x) for all j ≠ k. Optimum classification is based on the posterior probability, gk(x) = P(ωk|x). Any monotone function g may be applied without changing the decision boundaries, e.g. gk(x) = ln P(ωk|x).

38 The Two-Category Case Use discriminant functions g1 and g2, and assign x to ω1 if g1 > g2. Alternative: define a single discriminant function g(x) = g1(x) - g2(x), and decide ω1 if g(x) > 0, otherwise decide ω2. For the two-category case: g(x) = P(ω1|x) - P(ω2|x), or g(x) = ln[p(x|ω1)/p(x|ω2)] + ln[P(ω1)/P(ω2)].
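A minimal sketch of the single discriminant g(x), assuming (purely for illustration) one-dimensional Gaussian class-conditional densities with unit variance:

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density, used here as a stand-in class-conditional p(x|w)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def g(x, mu1, mu2, sigma, P1, P2):
    """g(x) = ln[p(x|w1)/p(x|w2)] + ln[P(w1)/P(w2)]; decide w1 when g > 0."""
    return math.log(gauss(x, mu1, sigma) / gauss(x, mu2, sigma)) + math.log(P1 / P2)

# Equal priors, class means 0 and 4: the boundary sits at the midpoint x = 2.
print(g(1.0, 0.0, 4.0, 1.0, 0.5, 0.5) > 0)   # True: decide class 1
print(g(3.0, 0.0, 4.0, 1.0, 0.5, 0.5) > 0)   # False: decide class 2
```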

39 Summary Bayes approach: estimate the class-conditional probability density, combine it with the prior class probability, determine the posterior class probability, and derive the decision boundaries. Alternate approach (implemented by NNs): estimate the posterior probability directly, i.e. determine the decision boundaries directly.

40 DISCRIMINANT FUNCTIONS

41 Discriminant Functions The classifier determines membership in a category based on the comparison of R discriminant functions g1(x), g2(x), ..., gR(x): x is within the region Xk if gk(x) has the largest value. Do not mix up: n = dimension of each input vector (dimension of the feature space); P = number of input vectors; and R = number of classes.

42 Discriminant Functions

43 Discriminant Functions

44 Discriminant Functions

45 Discriminant Functions

46 Discriminant Functions

47 Linear Machine and Minimum Distance Classification Find the linear-form discriminant function for two-class classification when the class prototypes are known. Example 3.1: select the decision hyperplane that contains the midpoint of the line segment connecting the center points of the two classes.

48 Linear Machine and Minimum Distance Classification (dichotomizer) The dichotomizer's discriminant function g(x): g(x) = (x1 - x2)^t x + ½(‖x2‖² - ‖x1‖²), where x1 and x2 are the centers of the two classes; g(x) > 0 assigns x to class 1, g(x) < 0 to class 2.

49 Linear Machine and Minimum Distance Classification (multiclass classification) The linear-form discriminant functions for multiclass classification: there are up to R(R-1)/2 decision hyperplanes for R pairwise separable classes (i.e. classes next to, or touching, one another).

50 Linear Machine and Minimum Distance Classification (multiclass classification) Linear machine or minimum-distance classifier: assume the class prototypes are known for all classes. The Euclidean distance between an input pattern x and the center of class i, xi, is ‖x - xi‖; minimizing it is equivalent to maximizing gi(x) = xi^t x - ½ xi^t xi, which is linear in x.
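Those formulas translate directly into a minimum-distance (linear machine) classifier; the three prototype centers below are hypothetical:

```python
def linear_machine(prototypes):
    """Discriminants g_i(x) = x_i.x - 0.5*||x_i||^2 built from the class
    prototype centers x_i; maximizing g_i is equivalent to minimizing the
    Euclidean distance ||x - x_i||."""
    def classify(x):
        def g(p):
            return (sum(pi * xi for pi, xi in zip(p, x))
                    - 0.5 * sum(pi * pi for pi in p))
        return max(range(len(prototypes)), key=lambda i: g(prototypes[i]))
    return classify

# Three hypothetical class centers in the plane:
classify = linear_machine([(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)])
print(classify((0.5, 0.5)))   # 0: nearest to (0, 0)
print(classify((3.0, 1.0)))   # 1: nearest to (4, 0)
```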

51 Linear Machine and Minimum Distance Classification (multiclass classification)

52 Linear Machine and Minimum Distance Classification P1, P2, P3 are the centers of gravity of the prototype points, and we need to design a minimum distance classifier. Using the formulas from the previous slide, we get the weight vectors wi. Note: to find the decision surface S12 we need to compute g1 - g2.

53 Linear Machine and Minimum Distance Classification If R linear discriminant functions exist for a set of patterns such that gi(x) > gj(x) for all x in Class i, with i = 1, 2, ..., R, j = 1, 2, ..., R, i ≠ j, then the classes are linearly separable.

54 Linear Machine and Minimum Distance Classification Example:

55 Linear Machine and Minimum Distance Classification Example

56 Linear Machine and Minimum Distance Classification Examples 3.1 and 3.2 have shown that the coefficients (weights) of the linear discriminant functions can be determined if the a priori information about the sets of patterns and their class membership is known. In the next section (discrete perceptron), we will examine neural networks that derive their weights during the learning cycle.

57 Linear Machine and Minimum Distance Classification An example of linearly non-separable patterns

58 Linear Machine and Minimum Distance Classification o1 = sgn(w1 x1 + w2 x2 + w3): the mapping from the input space (x1, x2) to the image space o1.

59 Linear Machine and Minimum Distance Classification Applying two threshold units, o1 = sgn(w1 x1 + w2 x2 + w3) and o2 = sgn(...), to the inputs: note that several distinct inputs map to the same point in the image space.

60 The Discrete Perceptron

61 Discrete Perceptron Training Algorithm So far, we have shown that the coefficients of linear discriminant functions, called weights, can be determined based on a priori information about sets of patterns and their class membership. In what follows, we will begin to examine neural network classifiers that derive their weights during the learning cycle. The sample pattern vectors x1, x2, ..., xp, called the training sequence, are presented to the machine along with the correct response.

62 Discrete Perceptron Training Algorithm - Geometrical Representations (Zurada, Chapter 3) Each decision surface in the weight space intersects the origin (point 0); there are 5 prototype patterns in this case: y1, ..., y5. If the dimension of the augmented pattern vector is > 3, our powers of visualization are no longer of assistance. In that case, the only recourse is to use the analytical approach.

63 Discrete Perceptron Training Algorithm - Geometrical Representations Devise an analytic approach based on the geometrical representations, e.g. the decision surface for the training pattern y1 (y1 in Class 1, see previous slide) is the plane w^t y1 = 0 in weight space. If y is in Class 1 and is misclassified: w' = w + cy. If y is in Class 2 and is misclassified: w' = w - cy. The gradient gives the direction of steepest increase, and the correction increment c > 0 controls the size of the adjustment (c is two times the learning constant ρ introduced before); the correction is in the negative gradient direction.

64 Discrete Perceptron Training Algorithm - Geometrical Representations

65 Discrete Perceptron Training Algorithm - Geometrical Representations The distance p from the current weight vector w to the decision plane w^t y = 0 is p = |w^t y| / ‖y‖ (note: p > 0), and choosing c = |w^t y| / (y^t y) moves the weights exactly onto that plane. Note: c is then not constant, and depends on the current training pattern, as expressed by the equation above.

66 Discrete Perceptron Training Algorithm - Geometrical Representations The initial weight should be different from 0: if w = 0, then c = |w^t y| / (y^t y) = 0 and w' = w + cy = 0, therefore no adjustments are possible. For the fixed correction rule (c = constant), the correction of weights is always the same fixed portion of the current training vector, and the weight can be initialised at any value. For the dynamic correction rule, c depends on the distance from the weight (i.e. the weight vector) to the decision surface in the weight space; hence c is a function of the current weight and the current input pattern.

67 Discrete Perceptron Training Algorithm - Geometrical Representations Dynamic correction rule: using the value of c from the previous slide as a reference, we devise an adjustment technique which depends on the length ratio λ. λ = 2: symmetrical reflection w.r.t. the decision plane. λ = 0: no weight adjustment. Note: λ is the ratio of the distance between the old weight vector and the new one, to the distance from w to the pattern hyperplane.

68 Discrete Perceptron Training Algorithm - Geometrical Representations Example: the training patterns are x1 = 1, x2 = -0.5, x3 = 3, x4 = -2, with d1 = d3 = 1 (class 1) and d2 = d4 = -1 (class 2). The augmented input vectors are y1 = [1, 1]^t, y2 = [-0.5, 1]^t, y3 = [3, 1]^t, y4 = [-2, 1]^t. The decision lines w^t yi = 0, for i = 1, 2, 3, 4, are sketched on the augmented weight space as follows:

69 Discrete Perceptron Training Algorithm - Geometrical Representations

70 Discrete Perceptron Training Algorithm - Geometrical Representations For c = 1 and w1 = [-2.5, 1.75]^t, using w' = w ± cy, the weight training with each step can be summarized as w^(k+1) = w^k + (c/2)[d^k - sgn((w^k)^t y^k)] y^k. We obtain the following outputs and weight updates. Step 1: pattern y1 is input; o1 = sgn((w1)^t y1) differs from d1, so w2 = w1 + cy1.

71 Discrete Perceptron Training Algorithm - Geometrical Representations Step 2: pattern y2 is input; it is misclassified (o2 = sgn((w2)^t y2) ≠ d2), so w3 = w2 - cy2. Step 3: pattern y3 is input; it is also misclassified, so w4 = w3 + cy3.

72 Discrete Perceptron Training Algorithm - Geometrical Representations Since we have no evidence of correct classification of weight w4, the training set, consisting of an ordered sequence of patterns y1, y2 and y3, needs to be recycled. We thus have y4 = y1, y5 = y2, etc. (the superscript is used to denote the training step number). Steps 4, 5: no misclassification, thus no weight adjustments. You can check that the adjustments in steps 6 through 10 produce a final weight vector that lies in the solution area.
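The fixed-correction training procedure fits in a short loop. A minimal sketch, with illustrative pattern values and class labels (assumed, since the slide's numbers are partly garbled):

```python
def train_discrete_perceptron(patterns, desired, w, c=1.0, max_cycles=100):
    """Fixed-correction rule: w <- w + (c/2)(d - sgn(w.y)) y, cycling
    through the augmented training patterns until none is misclassified."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    for _ in range(max_cycles):
        errors = 0
        for y, d in zip(patterns, desired):
            o = sgn(sum(wi * yi for wi, yi in zip(w, y)))
            if o != d:
                w = [wi + 0.5 * c * (d - o) * yi for wi, yi in zip(w, y)]
                errors += 1
        if errors == 0:
            break
    return w

# Illustrative augmented 1-D patterns [x, 1] and bipolar labels:
pats = [(1.0, 1.0), (-0.5, 1.0), (3.0, 1.0), (-2.0, 1.0)]
ds = [1.0, -1.0, 1.0, -1.0]
w = train_discrete_perceptron(pats, ds, w=[-2.5, 1.75])
print(w)
```

After convergence, every pattern satisfies sgn(w·y) = d, i.e. the final weight vector lies in the solution region of weight space.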

73 The Continuous Perceptron

74 Continuous Perceptron Training Algorithm (Zurada, Chapter 3) Replace the TLU (Threshold Logic Unit) with the sigmoid activation function, for two reasons: to gain finer control over the training procedure, and to facilitate the differential characteristics that enable computation of the error gradient of the current error function E = ½(d - o)². The factor ½ does not affect the location of the error minimum.

75 Continuous Perceptron Training Algorithm The new weights are obtained by moving in the direction of the negative gradient along the multidimensional error surface. By definition of the steepest descent concept, each elementary move should be perpendicular to the current error contour.

76 Continuous Perceptron Training Algorithm Define the error as the (halved) squared difference between the desired output and the actual output: E = ½(d - o)², with o = f(net). Since net = w^t y, we have ∂net/∂wi = yi for i = 1, 2, ..., n + 1, so the gradient is ∇E = -(d - o) f'(net) y. The resulting training rule of the continuous perceptron, w' = w + η(d - o) f'(net) y, is equivalent to the delta training rule.
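A single delta-rule step can be sketched as follows; for the bipolar sigmoid o = 2/(1 + exp(-λ·net)) - 1 the derivative is f'(net) = (λ/2)(1 - o²):

```python
import math

def delta_rule_step(w, y, d, eta=0.1, lam=1.0):
    """One continuous-perceptron update. With the bipolar sigmoid
    o = 2/(1 + exp(-lam*net)) - 1, minimizing E = 0.5*(d - o)^2 by
    gradient descent gives w <- w + eta*(d - o)*f'(net)*y, where
    f'(net) = 0.5*lam*(1 - o*o)."""
    net = sum(wi * yi for wi, yi in zip(w, y))
    o = 2.0 / (1.0 + math.exp(-lam * net)) - 1.0
    fprime = 0.5 * lam * (1.0 - o * o)
    w_new = [wi + eta * (d - o) * fprime * yi for wi, yi in zip(w, y)]
    return w_new, 0.5 * (d - o) ** 2

# Repeated steps on a single (hypothetical) pattern drive the error downhill:
w, e0 = delta_rule_step([0.0, 0.0], [1.0, 1.0], 1.0)
for _ in range(200):
    w, e = delta_rule_step(w, [1.0, 1.0], 1.0)
print(e0, "->", e)
```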

77 Continuous Perceptron Training Algorithm

78 Continuous Perceptron Training Algorithm Same as the previous example of the discrete perceptron, but with a continuous activation function and using the delta rule. Same training pattern set as in the discrete perceptron example.

79 Continuous Perceptron Training Algorithm For each pattern, Ek = ½[dk - ok]² with ok = 2/(1 + exp(-λ netk)) - 1. Since 1 - o = 2/(1 + exp(λ net)) and 1 + o = 2/(1 + exp(-λ net)), reducing the terms simplifies this expression to the following form: E1 = 2/[1 + exp(λ(w1 + w2))]², and similarly E2 = 2/[1 + exp(λ(0.5 w1 - w2))]², E3 = 2/[1 + exp(λ(3 w1 + w2))]², E4 = 2/[1 + exp(λ(2 w1 - w2))]². These error surfaces are as shown on the previous slide.

80 Continuous Perceptron Training Algorithm The error surface and its minimum.

81 Multicategory SLP

82 Multi-category Single Layer Perceptron Nets Treat the last, fixed component of the input pattern vector as the neuron activation threshold: the augmented pattern is y = [x^t, 1]^t, and it is irrelevant whether this fixed component is equal to +1 or -1.

83 Multi-category Single Layer Perceptron Nets R-category linear classifier using R discrete bipolar perceptrons. Goal: a response of +1 from the i-th TLU is indicative of class i, while all other TLUs respond with -1.

84 Multi-category Single Layer Perceptron Nets Example 3.5: Indecision regions are regions where no class membership of an input pattern can be uniquely determined based on the response of the classifier; patterns in the shaded areas are not assigned any reasonable classification. E.g. for point Q the classifier's response is indecisive (no single TLU responds with +1). However, no patterns such as Q have been used for training in the example.

85 Multi-category Single Layer Perceptron Nets For c = 1 and the given initial weight vectors of the three TLUs. Step 1: pattern y1 is input, and the responses oi = sgn(wi^t y1) of the three TLUs are computed. Since the only incorrect response is provided by TLU3, its weights are corrected, w3' = w3 - cy1, while the weights of TLU1 and TLU2 are left unchanged.

86 Multi-category Single Layer Perceptron Nets Step 2: pattern y2 is input, the responses sgn(wi^t y2) are computed, and the weights of any TLU whose response is incorrect are corrected by w' = w ± cy2.

87 Multi-category Single Layer Perceptron Nets Step 3: pattern y3 is input. One can verify that the only adjusted weights from now on are those of a single TLU; the responses of the other units are already correct. During the second cycle:

88 Multi-category Single Layer Perceptron Nets R-category linear classifier using R continuous bipolar perceptrons
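The R-category scheme above can be sketched as R independent discrete bipolar perceptrons sharing one training loop; the prototype patterns here are hypothetical:

```python
def train_multicategory(patterns, labels, R, cycles=50, c=1.0):
    """R discrete bipolar perceptrons (TLUs): unit i is trained to
    answer +1 on patterns of class i and -1 on all other patterns;
    only units that answer incorrectly are corrected."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    W = [[0.0] * len(patterns[0]) for _ in range(R)]
    for _ in range(cycles):
        for y, k in zip(patterns, labels):
            for i in range(R):
                d = 1.0 if i == k else -1.0
                o = sgn(sum(wi * yi for wi, yi in zip(W[i], y)))
                if o != d:
                    W[i] = [wi + 0.5 * c * (d - o) * yi
                            for wi, yi in zip(W[i], y)]
    return W

# Three hypothetical augmented prototypes, one per class:
pats = [(2.0, 0.0, 1.0), (-1.0, 2.0, 1.0), (-1.0, -2.0, 1.0)]
W = train_multicategory(pats, [0, 1, 2], R=3)
responses = [[1.0 if sum(wi * yi for wi, yi in zip(W[i], y)) >= 0 else -1.0
              for i in range(3)]
             for y in pats]
print(responses)   # one +1 per row, on the diagonal
```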

89 Comparison between Perceptron and Bayes Classifier The perceptron operates on the premise that the patterns to be classified are linearly separable (otherwise the training algorithm will oscillate), while the Bayes classifier can work on nonseparable patterns. The Bayes classifier minimizes the probability of misclassification, which is independent of the underlying distribution. The Bayes classifier is a linear classifier on the assumption of Gaussianity. The perceptron is non-parametric, while the Bayes classifier is parametric (its derivation is contingent on the assumption of the underlying distributions). The perceptron is adaptive and simple to implement; the Bayes classifier could be made adaptive, but at the expense of increased storage and more complex computations.

90 APPENDIX A Unconstrained Optimization Techniques

91 Unconstrained Optimization Techniques (Haykin, Chapter 3) Cost function E(w), continuously differentiable: a measure of how to choose w of an adaptive filtering algorithm so that it behaves in an optimum manner. We want to find an optimal solution w* that minimizes E(w): ∇E(w*) = 0. Local iterative descent: starting with an initial guess denoted by w(0), generate a sequence of weight vectors w(1), w(2), ..., such that the cost function E(w) is reduced at each iteration of the algorithm, as shown by E(w(n+1)) < E(w(n)). Methods: steepest descent, Newton's, and Gauss-Newton's.

92 Method of Steepest Descent Here the successive adjustments applied to w are in the direction of steepest descent, that is, in a direction opposite to the gradient: w(n+1) = w(n) - a g(n), where a is a small positive constant called the step size or learning-rate parameter, and g is the gradient. The method of steepest descent converges to the optimal solution w* slowly, and the learning-rate parameter a has a profound influence on its convergence behavior (overdamped, underdamped, or even unstable/divergent).
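The update w(n+1) = w(n) - a·g(n) can be sketched on a simple quadratic cost; both the cost and the step size are illustrative choices:

```python
def steepest_descent(grad, w, a=0.1, steps=100):
    """w(n+1) = w(n) - a*g(n): small repeated steps against the gradient."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - a * gi for wi, gi in zip(w, g)]
    return w

# Illustrative quadratic cost E(w) = (w1 - 1)^2 + (w2 + 2)^2:
grad = lambda w: [2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)]
w = steepest_descent(grad, [5.0, 5.0])
print([round(wi, 3) for wi in w])   # converges toward the minimum [1, -2]
```

With a too large (here, a > 1 makes the per-step factor |1 - 2a| exceed 1), the same loop diverges, which is the unstable behavior the slide mentions.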

93 Newton's Method Using a second-order Taylor series expansion of the cost function around the point w(n): ΔE(w(n)) = E(w(n+1)) - E(w(n)) ≈ g^T(n)Δw(n) + ½ Δw^T(n) H(n) Δw(n), where Δw(n) = w(n+1) - w(n) and H is the Hessian matrix of E. We want the Δw* that minimizes ΔE(w(n)), so differentiating with respect to Δw gives g(n) + H(n)Δw* = 0, so Δw* = -H^(-1)(n) g(n).

94 Newton's Method Finally, w(n+1) = w(n) - H^(-1)(n) g(n). Newton's method converges quickly (asymptotically) and does not exhibit the zigzagging behavior, but the Hessian H(n) has to be a positive definite matrix for all n.

95 Gauss-Newton Method The Gauss-Newton method is applicable to a cost function that is a sum of error squares, E(w) = ½ Σi e²(i). Because the error signal e(i) is a function of w, we linearize the dependence of e(i) on w by writing e'(i, w) = e(i) + [∂e(i)/∂w]^T (w - w(n)). Equivalently, by using matrix notation we may write e'(n, w) = e(n) + J(n)(w - w(n)).

96 Gauss-Newton Method Here J(n) is the n-by-m Jacobian matrix of e(n) (the matrix of the partial derivatives ∂e(i)/∂wj; see the bottom of this slide). We want the updated weight vector w(n+1) defined by w(n+1) = arg min over w of ½‖e'(n, w)‖². Simple algebraic calculation tells us that ½‖e'(n, w)‖² = ½‖e(n)‖² + e^T(n)J(n)(w - w(n)) + ½(w - w(n))^T J^T(n)J(n)(w - w(n)). Now differentiating this expression with respect to w and setting the result to 0, we obtain:

97 Gauss-Newton Method J^T(n)e(n) + J^T(n)J(n)(w - w(n)) = 0. Thus we get w(n+1) = w(n) - (J^T(n)J(n))^(-1) J^T(n) e(n). To guard against the possibility that the matrix product J^T(n)J(n) is singular, the customary practice is to use w(n+1) = w(n) - (J^T(n)J(n) + δI)^(-1) J^T(n) e(n), where δ is a small positive constant. This modification's effect is progressively reduced as the number of iterations, n, is increased.

98 Linear Least-Squares Filter The single neuron around which it is built is linear, and the cost function consists of the sum of error squares. Using y(i) = x^T(i) w and e(i) = d(i) - y(i), the error vector is e(n) = d(n) - X(n)w(n); differentiating with respect to w correspondingly, the Jacobian is J(n) = -X(n). Substituting into the Gauss-Newton update from the previous slide: w(n+1) = w(n) + (X^T(n)X(n))^(-1) X^T(n) (d(n) - X(n)w(n)) = (X^T(n)X(n))^(-1) X^T(n) d(n).

99 LMS Algorithm Based on the use of instantaneous values for the cost function: E(w) = ½ e²(n). Differentiating with respect to w: ∂E(w)/∂w = e(n) ∂e(n)/∂w. The error signal in the LMS algorithm is e(n) = d(n) - x^T(n)w(n), hence ∂e(n)/∂w(n) = -x(n), so ∂E(w)/∂w(n) = -x(n)e(n).

100 LMS Algorithm Using -x(n)e(n) as an estimate ĝ(n) for the gradient vector, and using this estimate in the steepest descent method, the LMS algorithm follows as: ŵ(n+1) = ŵ(n) + η x(n) e(n), with η the learning-rate parameter. The inverse of η is a measure of the memory of the LMS algorithm: when η is small, the adaptive process progresses slowly, more of the past data are remembered, and a more accurate filtering action results.
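The LMS update is a two-line loop. A minimal sketch that identifies a hypothetical linear plant d = 2·x1 - x2 from input/output samples:

```python
import random

def lms(samples, desired, eta=0.05, n_taps=2):
    """LMS: w <- w + eta * x * e, with e = d - w.x, using only the
    instantaneous error (no statistics of the environment needed)."""
    w = [0.0] * n_taps
    for x, d in zip(samples, desired):
        e = d - sum(wi * xi for wi, xi in zip(w, x))
        w = [wi + eta * xi * e for wi, xi in zip(w, x)]
    return w

# Noise-free input/output pairs from the hypothetical plant d = 2*x1 - x2:
random.seed(0)
xs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2000)]
ds = [2 * x1 - x2 for x1, x2 in xs]
print([round(wi, 2) for wi in lms(xs, ds)])   # close to [2.0, -1.0]
```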

101 LMS Characteristics The LMS algorithm produces an estimate ŵ(n) of the weight vector, sacrificing a distinctive feature: the steepest descent algorithm follows a well-defined trajectory, while the LMS algorithm's ŵ(n) follows a random trajectory. As the number of iterations goes to infinity, ŵ(n) performs a random walk. But importantly, the LMS algorithm does not require knowledge of the statistics of the environment.

102 Convergence Consideration Two distinct quantities, η and x(n), determine the convergence: the user supplies η, and the selection of x(n) is important for the LMS algorithm to converge. Convergence of the mean, E[ŵ(n)] → w0 as n → ∞, is not a practical criterion. Convergence in the mean square: E[e²(n)] → constant as n → ∞. The convergence condition for the LMS algorithm in the mean square is 0 < η < 2 / (sum of mean-square values of the sensor inputs).

103 APPENDIX B Perceptron Convergence Proof

104 Perceptron Convergence Proof (Haykin, Chapter 3) Consider the following perceptron: v = Σ (i = 0 to m) wi xi = w^T x, with w^T x > 0 for every input vector x belonging to class C1, and w^T x ≤ 0 for every input vector x belonging to class C2.

105 Perceptron Convergence Proof The algorithm for the weight adjustment of the perceptron: if x(n) is correctly classified, no adjustments to w: w(n+1) = w(n) if w^T(n)x(n) > 0 and x(n) belongs to class C1, or if w^T(n)x(n) ≤ 0 and x(n) belongs to class C2. Otherwise: w(n+1) = w(n) - η(n)x(n) if w^T(n)x(n) > 0 and x(n) belongs to class C2, and w(n+1) = w(n) + η(n)x(n) if w^T(n)x(n) ≤ 0 and x(n) belongs to class C1. The learning-rate parameter η(n) controls the adjustment applied to the weight vector.

106 Perceptron Convergence Proof For η(n) = 1 and w(0) = 0, suppose the perceptron incorrectly classifies the vectors x(1), x(2), ... belonging to C1, so that w^T(k)x(k) ≤ 0. Since η = 1, w(k+1) = w(k) + x(k) for x(k) belonging to C1, and since w(0) = 0 we find, iteratively, w(n+1) = x(1) + x(2) + ... + x(n) (B1). Since the classes C1 and C2 are assumed to be linearly separable, there exists a solution w0 for which w0^T x > 0 for the vectors x(1), ..., x(n) belonging to the subset H1 (the subset of training vectors that belong to class C1).

107 Perceptron Convergence Proof For a fixed solution w0, we may then define a positive number α as α = min over x(n) in H1 of w0^T x(n) (B2). Hence equation (B1) above implies w0^T w(n+1) = w0^T x(1) + w0^T x(2) + ... + w0^T x(n). Using equation (B2), since each term is greater than or equal to α, we have w0^T w(n+1) ≥ nα. Now we use the Cauchy-Schwarz inequality: ‖a‖² ‖b‖² ≥ (a^T b)², for b ≠ 0.

108 Perceptron Convergence Proof This implies that ‖w(n+1)‖² ≥ n²α² / ‖w0‖² (B3). Now let's follow another development route (notice the index k): w(k+1) = w(k) + x(k), for k = 1, ..., n and x(k) in H1. By taking the squared Euclidean norm of both sides, we get ‖w(k+1)‖² = ‖w(k)‖² + ‖x(k)‖² + 2w^T(k)x(k). But under the assumption that the perceptron incorrectly classifies an input vector x(k) belonging to the subset H1, we have w^T(k)x(k) ≤ 0 and hence ‖w(k+1)‖² ≤ ‖w(k)‖² + ‖x(k)‖².

109 Perceptron Convergence Proof Or, equivalently, ‖w(k+1)‖² - ‖w(k)‖² ≤ ‖x(k)‖², for k = 1, ..., n. Adding these inequalities for k = 1, ..., n, and invoking the initial condition w(0) = 0, we get the following inequality: ‖w(n+1)‖² ≤ Σ (k = 1 to n) ‖x(k)‖² ≤ nβ (B4), where β is a positive number defined by β = max over x(k) in H1 of ‖x(k)‖². Eq. (B4) states that the squared Euclidean norm of w(n+1) grows at most linearly with the number of iterations n.

110 Perceptron Convergence Proof The second result (B4) is clearly in conflict with Eq. (B3) for large enough n. Indeed, we can state that n cannot be larger than some value n_max for which Eqs. (B3) and (B4) are both satisfied with the equality sign. That is, n_max is the solution of the equation n_max² α² / ‖w0‖² = n_max β. Solving for n_max, given a solution vector w0, we find n_max = β ‖w0‖² / α². We have thus proved that for η(n) = 1 for all n, and for w(0) = 0, given that a solution vector w0 exists, the rule for adapting the synaptic weights of the perceptron must terminate after at most n_max iterations.
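The bound can be checked numerically: count the error-correction updates on a small separable set and compare against n_max = β‖w0‖²/α². The data and the solution vector w0 below are hypothetical choices:

```python
def train_count_updates(patterns, max_passes=1000):
    """eta = 1, w(0) = 0 perceptron on pre-normalized data: every vector
    in `patterns` should end up satisfying w.x > 0 (class C2 vectors are
    multiplied by -1 beforehand). Returns the final weights and the
    number of error-correction updates performed."""
    w = [0.0] * len(patterns[0])
    updates = 0
    for _ in range(max_passes):
        changed = False
        for x in patterns:
            if sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + xi for wi, xi in zip(w, x)]
                updates += 1
                changed = True
        if not changed:
            break
    return w, updates

# Hypothetical separable data (class C2 vectors already negated):
pats = [(1.0, 2.0, 1.0), (2.0, 1.0, 1.0), (1.0, 2.0, -1.0), (2.0, 1.0, -1.0)]
w, n_updates = train_count_updates(pats)
# Bound from the proof, n_max = beta*||w0||^2/alpha^2, for the solution
# w0 = (1, 1, 0): alpha = min w0.x, beta = max ||x||^2 over the data.
w0 = (1.0, 1.0, 0.0)
alpha = min(sum(a * b for a, b in zip(w0, x)) for x in pats)
beta = max(sum(xi * xi for xi in x) for x in pats)
n_max = beta * sum(wi * wi for wi in w0) / alpha ** 2
print(n_updates, "<=", n_max)
```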

111 MORE READING

112 Suggested Reading 1. S. Haykin, Neural Networks, Prentice-Hall, 1999, chapter 3. 2. L. Fausett, Fundamentals of Neural Networks, Prentice-Hall, 1994, chapter 2. 3. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, Wiley, 2001, Appendix A.4, chapter 2, and chapter 5. 4. J. M. Zurada, Introduction to Artificial Neural Systems, West Publishing Company, 1992, chapter 3.

113 References: These lecture notes were based on the references of the previous slide, and the following references: 1. Berlin Chen, lecture notes, Normal University, Taipei, Taiwan, ROC. 2. Ehud Rivlin, IIT. 3. Jin Hyung Kim, KAIST Computer Science Dept., CS679 Neural Network lecture notes. 4. Dr John A. Bullinaria, course material, Introduction to Neural Networks.


More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

Markov Decision Processes

Markov Decision Processes Markov Decisio Processes Defiitios; Statioary policies; Value improvemet algorithm, Policy improvemet algorithm, ad liear programmig for discouted cost ad average cost criteria. Markov Decisio Processes

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam 4 will cover.-., 0. ad 0.. Note that eve though. was tested i exam, questios from that sectios may also be o this exam. For practice problems o., refer to the last review. This

More information

TEACHER CERTIFICATION STUDY GUIDE

TEACHER CERTIFICATION STUDY GUIDE COMPETENCY 1. ALGEBRA SKILL 1.1 1.1a. ALGEBRAIC STRUCTURES Kow why the real ad complex umbers are each a field, ad that particular rigs are ot fields (e.g., itegers, polyomial rigs, matrix rigs) Algebra

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

On Random Line Segments in the Unit Square

On Random Line Segments in the Unit Square O Radom Lie Segmets i the Uit Square Thomas A. Courtade Departmet of Electrical Egieerig Uiversity of Califoria Los Ageles, Califoria 90095 Email: tacourta@ee.ucla.edu I. INTRODUCTION Let Q = [0, 1] [0,

More information

Principle Of Superposition

Principle Of Superposition ecture 5: PREIMINRY CONCEP O RUCUR NYI Priciple Of uperpositio Mathematically, the priciple of superpositio is stated as ( a ) G( a ) G( ) G a a or for a liear structural system, the respose at a give

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques

More information

Mathematical Foundations -1- Sets and Sequences. Sets and Sequences

Mathematical Foundations -1- Sets and Sequences. Sets and Sequences Mathematical Foudatios -1- Sets ad Sequeces Sets ad Sequeces Methods of proof 2 Sets ad vectors 13 Plaes ad hyperplaes 18 Liearly idepedet vectors, vector spaces 2 Covex combiatios of vectors 21 eighborhoods,

More information

Chapter 7: The z-transform. Chih-Wei Liu

Chapter 7: The z-transform. Chih-Wei Liu Chapter 7: The -Trasform Chih-Wei Liu Outlie Itroductio The -Trasform Properties of the Regio of Covergece Properties of the -Trasform Iversio of the -Trasform The Trasfer Fuctio Causality ad Stability

More information

Optimization Methods MIT 2.098/6.255/ Final exam

Optimization Methods MIT 2.098/6.255/ Final exam Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short

More information

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense, 3. Z Trasform Referece: Etire Chapter 3 of text. Recall that the Fourier trasform (FT) of a DT sigal x [ ] is ω ( ) [ ] X e = j jω k = xe I order for the FT to exist i the fiite magitude sese, S = x [

More information

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

September 2012 C1 Note. C1 Notes (Edexcel) Copyright   - For AS, A2 notes and IGCSE / GCSE worksheets 1 September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright

More information

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar. Clusterig CM226: Machie Learig for Bioiformatics. Fall 216 Sriram Sakararama Ackowledgmets: Fei Sha, Ameet Talwalkar Clusterig 1 / 42 Admiistratio HW 1 due o Moday. Email/post o CCLE if you have questios.

More information

FMA901F: Machine Learning Lecture 4: Linear Models for Classification. Cristian Sminchisescu

FMA901F: Machine Learning Lecture 4: Linear Models for Classification. Cristian Sminchisescu FMA90F: Machie Learig Lecture 4: Liear Models for Classificatio Cristia Smichisescu Liear Classificatio Classificatio is itrisically o liear because of the traiig costraits that place o idetical iputs

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

6.003 Homework #3 Solutions

6.003 Homework #3 Solutions 6.00 Homework # Solutios Problems. Complex umbers a. Evaluate the real ad imagiary parts of j j. π/ Real part = Imagiary part = 0 e Euler s formula says that j = e jπ/, so jπ/ j π/ j j = e = e. Thus the

More information

We are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n

We are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n Review of Power Series, Power Series Solutios A power series i x - a is a ifiite series of the form c (x a) =c +c (x a)+(x a) +... We also call this a power series cetered at a. Ex. (x+) is cetered at

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer. 6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio

More information

Problem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t =

Problem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t = Mathematics Summer Wilso Fial Exam August 8, ANSWERS Problem 1 (a) Fid the solutio to y +x y = e x x that satisfies y() = 5 : This is already i the form we used for a first order liear differetial equatio,

More information

PAPER : IIT-JAM 2010

PAPER : IIT-JAM 2010 MATHEMATICS-MA (CODE A) Q.-Q.5: Oly oe optio is correct for each questio. Each questio carries (+6) marks for correct aswer ad ( ) marks for icorrect aswer.. Which of the followig coditios does NOT esure

More information

Differentiable Convex Functions

Differentiable Convex Functions Differetiable Covex Fuctios The followig picture motivates Theorem 11. f ( x) f ( x) f '( x)( x x) ˆx x 1 Theorem 11 : Let f : R R be differetiable. The, f is covex o the covex set C R if, ad oly if for

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

CS537. Numerical Analysis and Computing

CS537. Numerical Analysis and Computing CS57 Numerical Aalysis ad Computig Lecture Locatig Roots o Equatios Proessor Ju Zhag Departmet o Computer Sciece Uiversity o Ketucky Leigto KY 456-6 Jauary 9 9 What is the Root May physical system ca be

More information

Ma 530 Introduction to Power Series

Ma 530 Introduction to Power Series Ma 530 Itroductio to Power Series Please ote that there is material o power series at Visual Calculus. Some of this material was used as part of the presetatio of the topics that follow. What is a Power

More information

Machine Learning for Data Science (CS 4786)

Machine Learning for Data Science (CS 4786) Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm

More information

REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Axis Aligned Ellipsoid

Axis Aligned Ellipsoid Machie Learig for Data Sciece CS 4786) Lecture 6,7 & 8: Ellipsoidal Clusterig, Gaussia Mixture Models ad Geeral Mixture Models The text i black outlies high level ideas. The text i blue provides simple

More information

6.867 Machine learning, lecture 7 (Jaakkola) 1

6.867 Machine learning, lecture 7 (Jaakkola) 1 6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit

More information

Chapter 2 The Solution of Numerical Algebraic and Transcendental Equations

Chapter 2 The Solution of Numerical Algebraic and Transcendental Equations Chapter The Solutio of Numerical Algebraic ad Trascedetal Equatios Itroductio I this chapter we shall discuss some umerical methods for solvig algebraic ad trascedetal equatios. The equatio f( is said

More information

Math 257: Finite difference methods

Math 257: Finite difference methods Math 257: Fiite differece methods 1 Fiite Differeces Remember the defiitio of a derivative f f(x + ) f(x) (x) = lim 0 Also recall Taylor s formula: (1) f(x + ) = f(x) + f (x) + 2 f (x) + 3 f (3) (x) +...

More information

Complex Analysis Spring 2001 Homework I Solution

Complex Analysis Spring 2001 Homework I Solution Complex Aalysis Sprig 2001 Homework I Solutio 1. Coway, Chapter 1, sectio 3, problem 3. Describe the set of poits satisfyig the equatio z a z + a = 2c, where c > 0 ad a R. To begi, we see from the triagle

More information

Multilayer perceptrons

Multilayer perceptrons Multilayer perceptros If traiig set is ot liearly separable, a etwork of McCulloch-Pitts uits ca give a solutio If o loop exists i etwork, called a feedforward etwork (else, recurret etwork) A two-layer

More information

Generalized Semi- Markov Processes (GSMP)

Generalized Semi- Markov Processes (GSMP) Geeralized Semi- Markov Processes (GSMP) Summary Some Defiitios Markov ad Semi-Markov Processes The Poisso Process Properties of the Poisso Process Iterarrival times Memoryless property ad the residual

More information

The axial dispersion model for tubular reactors at steady state can be described by the following equations: dc dz R n cn = 0 (1) (2) 1 d 2 c.

The axial dispersion model for tubular reactors at steady state can be described by the following equations: dc dz R n cn = 0 (1) (2) 1 d 2 c. 5.4 Applicatio of Perturbatio Methods to the Dispersio Model for Tubular Reactors The axial dispersio model for tubular reactors at steady state ca be described by the followig equatios: d c Pe dz z =

More information

CS321. Numerical Analysis and Computing

CS321. Numerical Analysis and Computing CS Numerical Aalysis ad Computig Lecture Locatig Roots o Equatios Proessor Ju Zhag Departmet o Computer Sciece Uiversity o Ketucky Leigto KY 456-6 September 8 5 What is the Root May physical system ca

More information

Fall 2013 MTH431/531 Real analysis Section Notes

Fall 2013 MTH431/531 Real analysis Section Notes Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Subject: Differential Equations & Mathematical Modeling -III. Lesson: Power series solutions of Differential Equations. about ordinary points

Subject: Differential Equations & Mathematical Modeling -III. Lesson: Power series solutions of Differential Equations. about ordinary points Power series solutio of Differetial equatios about ordiary poits Subject: Differetial Equatios & Mathematical Modelig -III Lesso: Power series solutios of Differetial Equatios about ordiary poits Lesso

More information

Notes on iteration and Newton s method. Iteration

Notes on iteration and Newton s method. Iteration Notes o iteratio ad Newto s method Iteratio Iteratio meas doig somethig over ad over. I our cotet, a iteratio is a sequece of umbers, vectors, fuctios, etc. geerated by a iteratio rule of the type 1 f

More information

Naïve Bayes. Naïve Bayes

Naïve Bayes. Naïve Bayes Statistical Data Miig ad Machie Learig Hilary Term 206 Dio Sejdiovic Departmet of Statistics Oxford Slides ad other materials available at: http://www.stats.ox.ac.uk/~sejdiov/sdmml : aother plug-i classifier

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

Pattern Classification

Pattern Classification Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

1 Duality revisited. AM 221: Advanced Optimization Spring 2016

1 Duality revisited. AM 221: Advanced Optimization Spring 2016 AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R

More information

Lesson 10: Limits and Continuity

Lesson 10: Limits and Continuity www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals

More information

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations Differece Equatios to Differetial Equatios Sectio. Calculus: Areas Ad Tagets The study of calculus begis with questios about chage. What happes to the velocity of a swigig pedulum as its positio chages?

More information

Chapter 7 z-transform

Chapter 7 z-transform Chapter 7 -Trasform Itroductio Trasform Uilateral Trasform Properties Uilateral Trasform Iversio of Uilateral Trasform Determiig the Frequecy Respose from Poles ad Zeros Itroductio Role i Discrete-Time

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

Support vector machine revisited

Support vector machine revisited 6.867 Machie learig, lecture 8 (Jaakkola) 1 Lecture topics: Support vector machie ad kerels Kerel optimizatio, selectio Support vector machie revisited Our task here is to first tur the support vector

More information

Most text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t

Most text will write ordinary derivatives using either Leibniz notation 2 3. y + 5y= e and y y. xx tt t Itroductio to Differetial Equatios Defiitios ad Termiolog Differetial Equatio: A equatio cotaiig the derivatives of oe or more depedet variables, with respect to oe or more idepedet variables, is said

More information

Vector Quantization: a Limiting Case of EM

Vector Quantization: a Limiting Case of EM . Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z

More information

10.2 Infinite Series Contemporary Calculus 1

10.2 Infinite Series Contemporary Calculus 1 10. Ifiite Series Cotemporary Calculus 1 10. INFINITE SERIES Our goal i this sectio is to add together the umbers i a sequece. Sice it would take a very log time to add together the ifiite umber of umbers,

More information

Ω ). Then the following inequality takes place:

Ω ). Then the following inequality takes place: Lecture 8 Lemma 5. Let f : R R be a cotiuously differetiable covex fuctio. Choose a costat δ > ad cosider the subset Ωδ = { R f δ } R. Let Ωδ ad assume that f < δ, i.e., is ot o the boudary of f = δ, i.e.,

More information

Course Outline. Designing Control Systems. Proportional Controller. Amme 3500 : System Dynamics and Control. Root Locus. Dr. Stefan B.

Course Outline. Designing Control Systems. Proportional Controller. Amme 3500 : System Dynamics and Control. Root Locus. Dr. Stefan B. Amme 3500 : System Dyamics ad Cotrol Root Locus Course Outlie Week Date Cotet Assigmet Notes Mar Itroductio 8 Mar Frequecy Domai Modellig 3 5 Mar Trasiet Performace ad the s-plae 4 Mar Block Diagrams Assig

More information

INF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification

INF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification INF 4300 90 Itroductio to classifictio Ae Solberg ae@ifiuioo Based o Chapter -6 i Duda ad Hart: atter Classificatio 90 INF 4300 Madator proect Mai task: classificatio You must implemet a classificatio

More information

Lecture 8: Solving the Heat, Laplace and Wave equations using finite difference methods

Lecture 8: Solving the Heat, Laplace and Wave equations using finite difference methods Itroductory lecture otes o Partial Differetial Equatios - c Athoy Peirce. Not to be copied, used, or revised without explicit writte permissio from the copyright ower. 1 Lecture 8: Solvig the Heat, Laplace

More information

CALCULUS BASIC SUMMER REVIEW

CALCULUS BASIC SUMMER REVIEW CALCULUS BASIC SUMMER REVIEW NAME rise y y y Slope of a o vertical lie: m ru Poit Slope Equatio: y y m( ) The slope is m ad a poit o your lie is, ). ( y Slope-Itercept Equatio: y m b slope= m y-itercept=

More information

6.883: Online Methods in Machine Learning Alexander Rakhlin

6.883: Online Methods in Machine Learning Alexander Rakhlin 6.883: Olie Methods i Machie Learig Alexader Rakhli LECURE 4 his lecture is partly based o chapters 4-5 i [SSBD4]. Let us o give a variat of SGD for strogly covex fuctios. Algorithm SGD for strogly covex

More information

Analytic Continuation

Analytic Continuation Aalytic Cotiuatio The stadard example of this is give by Example Let h (z) = 1 + z + z 2 + z 3 +... kow to coverge oly for z < 1. I fact h (z) = 1/ (1 z) for such z. Yet H (z) = 1/ (1 z) is defied for

More information

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ.

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ. 2 5. Weighted umber of late jobs 5.1. Release dates ad due dates: maximimizig the weight of o-time jobs Oce we add release dates, miimizig the umber of late jobs becomes a sigificatly harder problem. For

More information

Subject: Differential Equations & Mathematical Modeling-III

Subject: Differential Equations & Mathematical Modeling-III Power Series Solutios of Differetial Equatios about Sigular poits Subject: Differetial Equatios & Mathematical Modelig-III Lesso: Power series solutios of differetial equatios about Sigular poits Lesso

More information

Math 113 Exam 4 Practice

Math 113 Exam 4 Practice Math Exam 4 Practice Exam 4 will cover.-.. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for

More information

subject to A 1 x + A 2 y b x j 0, j = 1,,n 1 y j = 0 or 1, j = 1,,n 2

subject to A 1 x + A 2 y b x j 0, j = 1,,n 1 y j = 0 or 1, j = 1,,n 2 Additioal Brach ad Boud Algorithms 0-1 Mixed-Iteger Liear Programmig The brach ad boud algorithm described i the previous sectios ca be used to solve virtually all optimizatio problems cotaiig iteger variables,

More information

Assignment 1 : Real Numbers, Sequences. for n 1. Show that (x n ) converges. Further, by observing that x n+2 + x n+1

Assignment 1 : Real Numbers, Sequences. for n 1. Show that (x n ) converges. Further, by observing that x n+2 + x n+1 Assigmet : Real Numbers, Sequeces. Let A be a o-empty subset of R ad α R. Show that α = supa if ad oly if α is ot a upper boud of A but α + is a upper boud of A for every N. 2. Let y (, ) ad x (, ). Evaluate

More information

U8L1: Sec Equations of Lines in R 2

U8L1: Sec Equations of Lines in R 2 MCVU U8L: Sec. 8.9. Equatios of Lies i R Review of Equatios of a Straight Lie (-D) Cosider the lie passig through A (-,) with slope, as show i the diagram below. I poit slope form, the equatio of the lie

More information

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients. Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,

More information

Chapter 7. Support Vector Machine

Chapter 7. Support Vector Machine Chapter 7 Support Vector Machie able of Cotet Margi ad support vectors SVM formulatio Slack variables ad hige loss SVM for multiple class SVM ith Kerels Relevace Vector Machie Support Vector Machie (SVM)

More information

Ma 530 Infinite Series I

Ma 530 Infinite Series I Ma 50 Ifiite Series I Please ote that i additio to the material below this lecture icorporated material from the Visual Calculus web site. The material o sequeces is at Visual Sequeces. (To use this li

More information

Pattern Classification

Pattern Classification Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher

More information

PC5215 Numerical Recipes with Applications - Review Problems

PC5215 Numerical Recipes with Applications - Review Problems PC55 Numerical Recipes with Applicatios - Review Problems Give the IEEE 754 sigle precisio bit patter (biary or he format) of the followig umbers: 0 0 05 00 0 00 Note that it has 8 bits for the epoet,

More information

6.867 Machine learning

6.867 Machine learning 6.867 Machie learig Mid-term exam October, ( poits) Your ame ad MIT ID: Problem We are iterested here i a particular -dimesioal liear regressio problem. The dataset correspodig to this problem has examples

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we

More information