ECE 417 Lecture 6: kNN and Linear Classifiers Amit Das 09/14/2017
Acknowledgments: http://research.cs.tamu.edu/prism/lectures/pr/pr_l8.pdf http://classes.engr.oregonstate.edu/eecs/spring2012/cs534/notes/knn.pdf
Nearest Neighbor Algorithm: Remember all training examples. Given a new example $x$, find its closest training example $\langle x_i, y_i \rangle$ and predict $y_i$. How to measure distance? Euclidean (squared): $\|x - x_i\|^2 = \sum_j (x_j - x_{ij})^2$.
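A minimal sketch of this rule, assuming numpy arrays `X_train` (one stored example per row) and `y_train` (the matching labels); the function name is illustrative:

```python
import numpy as np

def nearest_neighbor(X_train, y_train, x):
    # Squared Euclidean distance from x to every stored example:
    # ||x - x_i||^2 = sum_j (x_j - x_ij)^2
    d2 = np.sum((X_train - x) ** 2, axis=1)
    # Predict the label of the single closest training example.
    return y_train[np.argmin(d2)]
```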
Decision Boundaries: The Voronoi Diagram. Given a set of points, a Voronoi diagram partitions the space into regions, each consisting of the locations nearest to one of the points. These regions can be viewed as zones of control.
Decision Boundaries: The Voronoi Diagram. Decision boundaries are formed by a subset of the Voronoi diagram of the training data. Each line segment is equidistant between two points of opposite class. The more examples that are stored, the more fragmented and complex the decision boundaries can become.
Decision Boundaries. With a large number of examples and possible noise in the labels, the decision boundary can become nasty! We end up overfitting the data.
Example: k-Nearest Neighbor (k = 4). Find the k nearest neighbors of the new example and have them vote. Voting has a smoothing effect, which is especially good when there is noise in the class labels.
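A sketch of the voting version, under the same assumed array layout as the 1-NN sketch above:

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x, k=4):
    # Squared Euclidean distances to all stored examples.
    d2 = np.sum((X_train - x) ** 2, axis=1)
    # Indices of the k closest training examples.
    nearest = np.argsort(d2)[:k]
    # Majority vote among their labels (ties broken by first occurrence).
    return Counter(y_train[nearest]).most_common(1)[0][0]
```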
k-Nearest Neighbor: Probabilistic Interpretation. kNN can be interpreted as a maximum a posteriori (MAP) classifier, a.k.a. Bayes classifier. Posterior probability = p(class | example). kNN uses the MAP rule: predict the class with the largest estimated posterior.
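Concretely, if $k_c$ of the $k$ neighbors of $x$ carry label $c$, kNN estimates the posterior locally and picks its maximizer:

$$\hat{p}(y = c \mid x) \approx \frac{k_c}{k}, \qquad \hat{y} = \arg\max_c \hat{p}(y = c \mid x).$$

The majority vote is therefore exactly a MAP decision under this local estimate of the posterior.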
Effect of K. (Figures comparing K=1 and K=15 from Hastie, Tibshirani and Friedman, Elements of Statistical Learning.) A larger k produces a smoother decision boundary and can reduce the impact of class-label noise. But when k = N, we always predict the majority class.
Example I: kNN in action. Three-class 2D problem with non-linearly separable, multimodal likelihoods. We use the kNN rule (k = 5) and the Euclidean distance. The resulting decision boundaries and decision regions are shown below.
Example II. Two-dimensional 3-class problem with unimodal likelihoods sharing a common mean; these classes are also not linearly separable. We used the kNN rule (k = 5) and the Euclidean distance as a metric.
Question: how to choose k? Can we choose k to minimize the mistakes that we make on training examples (training error)? No: k = 1 achieves zero training error (every training point is its own nearest neighbor), yet it overfits. (Figure: error vs. model complexity, from K=20 down to K=1.) Choosing k is a model selection problem that we will study later.
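One standard remedy, given here as a sketch rather than the slides' prescription: pick k by error on a held-out validation set. This reuses the `knn_classify` sketch from above; the candidate list is arbitrary.

```python
import numpy as np

def select_k(X_train, y_train, X_val, y_val, candidates=(1, 3, 5, 9, 15)):
    # Assumes knn_classify from the earlier sketch is in scope.
    # Training error is useless here (k = 1 always scores zero on the
    # training set), so we measure error on held-out validation data.
    best_k, best_err = None, np.inf
    for k in candidates:
        err = np.mean([knn_classify(X_train, y_train, x, k=k) != y
                       for x, y in zip(X_val, y_val)])
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```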
Characteristics of the kNN Classifier.
Advantages: simple interpretation as a Bayes classifier; can discover complex non-linear boundaries between classes; easy to implement (about 10 effective lines of Matlab code).
Disadvantages: lazy learning algorithm: it defers learning until a test example is given, so it doesn't learn from the training data, it memorizes all of it; it must consider all training data before it can classify a test example; large storage and computation requirements; susceptible to the curse of dimensionality.
ECE 417 handwritten notes: the Bayes classifier and MAP detection.
Setup: feature vector $x = (x_1, \ldots, x_n)$, binary label $y \in \{0, 1\}$; a classifier is a decision rule $\hat{y} = f(x)$.
Probability of error: $P_e = P(\hat{y} \neq y) = P(\hat{y}=1 \mid y=0)\,P(y=0) + P(\hat{y}=0 \mid y=1)\,P(y=1)$. We want the decision rule that minimizes $P_e$.
MAP rule: decide $\hat{y} = 1$ iff $P(y=1 \mid x) > P(y=0 \mid x)$, where by Bayes' rule $P(y \mid x) = p(x \mid y)\,P(y)/p(x)$. MAP = maximize the posterior; this rule minimizes $P_e$.
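A quick numeric check with hypothetical values (not from the notes): suppose $P(y=1) = 0.25$ and, at the observed $x$, $p(x \mid y=1) = 0.6$ and $p(x \mid y=0) = 0.1$. Then

$$\frac{P(y=1 \mid x)}{P(y=0 \mid x)} = \frac{p(x \mid y=1)\,P(y=1)}{p(x \mid y=0)\,P(y=0)} = \frac{0.6 \times 0.25}{0.1 \times 0.75} = 2 > 1,$$

so the MAP rule decides $\hat{y} = 1$ even though class 1 is a priori less likely; note that $p(x)$ cancels from the ratio.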
Likelihood ratio test (LRT): the MAP rule is equivalent to
$$L(x) = \frac{p(x \mid y=1)}{p(x \mid y=0)} \;\gtrless\; \eta = \frac{P(y=0)}{P(y=1)},$$
deciding $\hat{y}=1$ when $L(x) > \eta$; with equal priors, $\eta = 1$.
Gaussian likelihoods: suppose $x \mid y=1 \sim N(\mu_1, \sigma_1^2)$ and $x \mid y=0 \sim N(\mu_0, \sigma_0^2)$, with $p(x \mid y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}$, so $L(x)$ is a ratio of two Gaussian densities.
Equal variances ($\sigma_0 = \sigma_1$): $\log L(x)$ is linear in $x$ (a linear discriminant). Unequal variances: $\log L(x)$ is quadratic in $x$ (a quadratic discriminant).
Multivariate case: $x \mid y=i \sim N(\mu_i, \Sigma_i)$. If $\Sigma_0 = \Sigma_1 = \Sigma$, the log-likelihood ratio is again linear, $g(x) = w^T x + b$ with $w = \Sigma^{-1}(\mu_1 - \mu_0)$ (LDA; the matched filter detector). If $\Sigma_0 \neq \Sigma_1$, it is quadratic in $x$.
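The cancellation that makes the equal-covariance case linear is worth writing out (standard algebra, not verbatim from the notes):

$$\log L(x) = \tfrac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0) - \tfrac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1) = (\mu_1 - \mu_0)^T \Sigma^{-1} x - \tfrac{1}{2}\left(\mu_1^T \Sigma^{-1} \mu_1 - \mu_0^T \Sigma^{-1} \mu_0\right).$$

The quadratic terms $x^T \Sigma^{-1} x$ cancel, leaving $g(x) = w^T x + b$ with $w = \Sigma^{-1}(\mu_1 - \mu_0)$.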
Scalar special case (equal priors, equal variances $\sigma$, with $\mu_1 > \mu_0$): the LRT reduces to a midpoint threshold, decide $\hat{y} = 1$ if $x \geq \frac{\mu_0 + \mu_1}{2}$, else $\hat{y} = 0$.
Probability of error for a general threshold $\theta$:
$$P_e(\theta) = P(y=0)\, Q\!\left(\frac{\theta - \mu_0}{\sigma}\right) + P(y=1)\, Q\!\left(\frac{\mu_1 - \theta}{\sigma}\right),$$
where $Q(z) = P(Z > z)$ for standard normal $Z$. With equal priors, the midpoint threshold gives $P_e = Q\!\left(\frac{\mu_1 - \mu_0}{2\sigma}\right)$.
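A sketch of this error computation in Python, writing the Q function via `erfc` (parameter names are illustrative):

```python
import math

def Q(x):
    # Gaussian tail probability: Q(x) = P(Z > x) for Z ~ N(0, 1).
    return 0.5 * math.erfc(x / math.sqrt(2))

def prob_error(theta, mu0, mu1, sigma, p0, p1):
    # Pe(theta) = P(y=0) P(x > theta | y=0) + P(y=1) P(x < theta | y=1),
    # using P(x < theta | y=1) = Q((mu1 - theta) / sigma).
    return p0 * Q((theta - mu0) / sigma) + p1 * Q((mu1 - theta) / sigma)
```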
Precision, recall, and related metrics. A binary detector's confusion matrix counts TP (true positives), FP (false positives), FN (false negatives), and TN (true negatives).
Accuracy = (TP + TN) / (TP + FP + FN + TN).
Precision = TP / (TP + FP): the fraction of retrieved (predicted positive) items that are relevant.
Recall = sensitivity = TPR = TP / (TP + FN): the fraction of relevant items that are retrieved.
F-score: $F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$, the harmonic mean of precision and recall. A classifier can have high precision with low recall, or vice versa; which trade-off is acceptable depends on the application.
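A sketch collecting these formulas; the counts are whatever your confusion matrix holds:

```python
def metrics(tp, fp, fn, tn):
    # Standard confusion-matrix summaries for a binary classifier.
    accuracy  = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # fraction of predicted positives that are correct
    recall    = tp / (tp + fn)   # fraction of actual positives that are found (TPR)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```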
Exam 2, Problem 1 (20 points). You want to classify zoo animals. Your zoo only has two species: elephants and giraffes. There are more elephants than giraffes: if $Y$ is the species,
$$p_Y(\text{elephant}) = \frac{e}{e+1}, \qquad p_Y(\text{giraffe}) = \frac{1}{e+1},$$
where $e = 2.718\ldots$ is the base of the natural logarithm. The height of giraffes is Gaussian, with mean $\mu_G = 5$ meters and variance $\sigma_G^2 = 1$. The height of elephants is also Gaussian, with mean $\mu_E = 3$ and variance $\sigma_E^2 = 1$. Under these circumstances, the minimum probability of error classifier is
$$\hat{y}(x) = \begin{cases} \text{giraffe} & x > \theta \\ \text{elephant} & x < \theta \end{cases}$$
Find the value of $\theta$ that minimizes the probability of error.
Solution (handwritten): let $y = 0$ denote elephant and $y = 1$ giraffe, so $x \mid y=0 \sim N(3, 1)$ and $x \mid y=1 \sim N(5, 1)$. The MAP rule decides giraffe when
$$\frac{p(x \mid \text{giraffe})}{p(x \mid \text{elephant})} > \frac{p_Y(\text{elephant})}{p_Y(\text{giraffe})} = \frac{e/(e+1)}{1/(e+1)} = e.$$
Taking logs: $\frac{(x-3)^2 - (x-5)^2}{2} > \ln e = 1$, i.e. $2x - 8 > 1$, i.e. $x > 4.5$. So $\theta = 4.5$.
The resulting probability of error is
$$P_e = \frac{e}{e+1}\, Q(4.5 - 3) + \frac{1}{e+1}\, Q(5 - 4.5) = \frac{e}{e+1}\, Q(1.5) + \frac{1}{e+1}\, Q(0.5).$$
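A numeric sanity check of this answer (a sketch; the exam itself only asks for $\theta$):

```python
import math

def Q(x):
    # Gaussian tail probability: Q(x) = P(Z > x) for Z ~ N(0, 1).
    return 0.5 * math.erfc(x / math.sqrt(2))

e = math.e
theta = 4.5                                    # threshold derived above
p_elephant, p_giraffe = e / (e + 1), 1 / (e + 1)

# Pe = P(elephant) P(x > theta | elephant) + P(giraffe) P(x < theta | giraffe)
pe = p_elephant * Q(theta - 3) + p_giraffe * Q(5 - theta)
print(theta, pe)                               # Pe is about 0.13
```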