ECE 417 Lecture 6: kNN and Linear Classifiers. Amit Das, 09/14/2017

Acknowledgments: http://research.cs.tamu.edu/prism/lectures/pr/pr_l8.pdf and http://classes.engr.oregonstate.edu/eecs/spring2012/cs534/notes/knn.pdf

Nearest Neighbor Algorithm. Remember all training examples. Given a new example x, find its closest training example (x_i, y_i) and predict y_i. How do we measure distance? Squared Euclidean: ||x − x_i||² = Σ_j (x_j − x_ij)².
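To make the rule concrete, here is a minimal sketch of the 1-NN predictor in Python (the lecture later mentions Matlab, but the idea is language-independent; the function name and array layout here are illustrative, not from the slides):

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x):
    """Label x with the label of its closest training example,
    using the squared Euclidean distance from the slide."""
    dists = np.sum((X_train - x) ** 2, axis=1)   # ||x - x_i||^2 for each i
    return y_train[np.argmin(dists)]

X_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([0, 1])
print(nearest_neighbor_predict(X_train, y_train, np.array([0.9, 0.8])))  # -> 1
```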

Decision Boundaries: The Voronoi Diagram. Given a set of points, a Voronoi diagram partitions the space into the regions that are nearest to each point. These regions can be viewed as zones of control.

Decision Boundaries: The Voronoi Diagram. Decision boundaries are formed by a subset of the Voronoi diagram of the training data. Each line segment is equidistant between two points of opposite class. The more examples that are stored, the more fragmented and complex the decision boundaries can become.
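As a hedged illustration (not part of the lecture), scipy can compute the Voronoi cells of the training points directly; each cell is the "zone of control" of one training example:

```python
import numpy as np
from scipy.spatial import Voronoi

# Five training points in the plane; each owns the region of space
# closer to it than to any other point.
points = np.array([[0, 0], [2, 0], [1, 2], [3, 3], [0, 3]], dtype=float)
vor = Voronoi(points)
print(vor.vertices)      # corners where cell boundaries meet
print(vor.ridge_points)  # index pairs of points whose cells share a boundary
```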

Decision Boundaries. With a large number of examples and possible noise in the labels, the decision boundary can become nasty! We end up overfitting the data.

Example: k-Nearest Neighbor, k = 4. Find the k nearest neighbors and have them vote. Voting has a smoothing effect; this is especially good when there is noise in the class labels.
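A minimal k-NN sketch with voting, extending the 1-NN code above (names again illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=4):
    """Find the k nearest training examples and let them vote."""
    dists = np.sum((X_train - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]             # majority label
```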

k-Nearest Neighbor: Probabilistic Interpretation. kNN can be interpreted as a Maximum A Posteriori (MAP) classifier, a.k.a. the Bayes classifier. Posterior probability: p(class | example). kNN applies the MAP rule with the posterior estimated from the neighborhood: p(class = c | x) ≈ (number of the k nearest neighbors labeled c) / k, and it predicts the class with the largest estimated posterior.
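Under the same illustrative conventions, the vote fractions are exactly the kNN posterior estimates:

```python
import numpy as np
from collections import Counter

def knn_posteriors(X_train, y_train, x, k=5):
    """Estimate p(class | x) as the fraction of the k nearest
    neighbors carrying each label (the MAP view of kNN)."""
    dists = np.sum((X_train - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    counts = Counter(y_train[i] for i in nearest)
    return {label: n / k for label, n in counts.items()}
```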

Effect of K. (Figures from Hastie, Tibshirani and Friedman, The Elements of Statistical Learning: decision boundaries for K = 1 and K = 15.) Larger k produces a smoother boundary and can reduce the impact of class-label noise. But when K = N, we always predict the majority class.

Example I: kNN in action. A three-class 2D problem with non-linearly separable, multimodal likelihoods. We use the kNN rule (k = 5) and the Euclidean distance. The resulting decision boundaries and decision regions are shown on the slide. (Slide credit: CSCE 666 Pattern Analysis, Ricardo Gutierrez-Osuna, TAMU.)

Example II. A two-dimensional, three-class problem with unimodal likelihoods sharing a common mean; these classes are also not linearly separable. We again use the kNN rule (k = 5) and the Euclidean distance as the metric.

Question: how to choose k? Can we choose k to minimize the mistakes that we make on training examples (the training error)? No: k = 1 always achieves zero training error by memorizing the data, yet generalizes poorly. (Figure: training error versus model complexity, for K = 20 down to K = 1.) Choosing k is a model selection problem that we will study later.
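One standard answer (an assumption here, not stated on the slide): hold out a validation set and pick the k with the lowest validation error, reusing the knn_predict sketch from earlier:

```python
def choose_k(X_train, y_train, X_val, y_val, candidates=(1, 3, 5, 9, 15)):
    """Pick k by held-out validation error; training error is useless
    for this, since k = 1 always achieves zero training error."""
    best_k, best_err = None, float("inf")
    for k in candidates:
        mistakes = sum(knn_predict(X_train, y_train, x, k) != y
                       for x, y in zip(X_val, y_val))
        err = mistakes / len(y_val)
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```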

Characteristics of the kNN Classifier.
Advantages: a simple approximation to the Bayes classifier; can discover complex, nonlinear boundaries between classes; easy to implement (about 10 effective lines of Matlab code).
Disadvantages: a lazy learning algorithm: it defers all work until a test example arrives, so rather than learning a model from the training data it memorizes all of it; it must consider every training example before it can classify a test example, which means large storage and computation requirements; susceptible to the curse of dimensionality.

Handwritten lecture notes: the Bayes classifier.

Setup: X is the observed feature and Y ∈ {0, 1} is the hidden class label. A detector (decision rule) ŷ(x) maps each observation to a predicted label. The same binary hypothesis testing problem shows up in communications: a bit Y is transmitted and a noise-corrupted observation X is received.

Priors: P(Y = 0) = P0 and P(Y = 1) = P1, with P0 + P1 = 1.

Bayes' rule gives the posterior probability of each class:

P(Y = y | X = x) = p(x | Y = y) P(Y = y) / p(x), where p(x) = p(x | Y = 1) P1 + p(x | Y = 0) P0.

MAP rule (maximize the posterior): ŷ(x) = argmax_y P(Y = y | x); for two classes, decide ŷ = 1 exactly when P(Y = 1 | x) ≥ P(Y = 0 | x).

Likelihood ratio test (LRT): the MAP rule is equivalent to comparing

L(x) = p(x | Y = 1) / p(x | Y = 0)

against the threshold η = P0 / P1: decide ŷ = 1 if L(x) > η and ŷ = 0 otherwise.

Gaussian example: suppose x | Y = 0 ~ N(µ0, σ²) and x | Y = 1 ~ N(µ1, σ²) with µ1 > µ0, where

p(x | Y = y) = (1 / √(2πσ²)) exp(−(x − µy)² / (2σ²)).

Then

L(x) = exp(−(x − µ1)²/(2σ²) + (x − µ0)²/(2σ²)),

which is monotonically increasing in x, so the LRT reduces to a threshold test: decide ŷ = 1 when x > θ for a suitable θ. If instead x | Y = 0 ~ N(µ0, σ0²) and x | Y = 1 ~ N(µ1, σ1²) with σ0 ≠ σ1, the log-likelihood ratio is quadratic in x and the decision regions need not be a single interval.
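Setting log L(x) equal to log η and solving gives a closed-form threshold; this short check (with illustrative names) confirms the midpoint result for equal priors:

```python
import numpy as np

def lrt_threshold(mu0, mu1, sigma, p0, p1):
    """Solve log L(x) = log(P0/P1) in the equal-variance scalar Gaussian case:
    theta = sigma^2 * log(P0/P1) / (mu1 - mu0) + (mu0 + mu1) / 2."""
    return sigma**2 * np.log(p0 / p1) / (mu1 - mu0) + (mu0 + mu1) / 2

# Equal priors: the threshold is the midpoint of the two means.
print(lrt_threshold(mu0=3.0, mu1=5.0, sigma=1.0, p0=0.5, p1=0.5))  # 4.0
```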

Vector case: if x | Y = y ~ N(µy, Σy), the MAP detector compares discriminant functions built from the Mahalanobis distances (x − µy)ᵀ Σy⁻¹ (x − µy), giving a quadratic decision boundary in general; when Σ0 = Σ1 the quadratic terms cancel and the boundary is linear in x, i.e. a linear classifier.

Special case (equal priors, equal variances σ0 = σ1 = σ): with x | Y = 0 ~ N(µ0, σ²) and x | Y = 1 ~ N(µ1, σ²), the rule is a threshold at the midpoint: decide ŷ = 1 if x > θ = (µ0 + µ1)/2 and ŷ = 0 if x < θ.

Probability of error for a threshold t:

Pe(t) = P0 · Q((t − µ0)/σ) + P1 · Q((µ1 − t)/σ),

where Q(z) is the tail probability of the standard Gaussian. The MAP threshold t* minimizes this: Pe(t*) ≤ Pe(t) for every other t.
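A numeric check of the error formula, using Q(z) = erfc(z/√2)/2 from the standard library (function names illustrative):

```python
from math import erfc, sqrt

def Q(z):
    """Gaussian tail probability Q(z) = P(N(0,1) > z)."""
    return 0.5 * erfc(z / sqrt(2))

def prob_error(theta, mu0, mu1, sigma, p0, p1):
    """Pe = P0*Q((theta - mu0)/sigma) + P1*Q((mu1 - theta)/sigma), mu1 > mu0."""
    return p0 * Q((theta - mu0) / sigma) + p1 * Q((mu1 - theta) / sigma)

# Sweeping theta shows the minimum sits at the MAP threshold (4.0 here).
for theta in (3.5, 4.0, 4.5):
    print(theta, prob_error(theta, mu0=3, mu1=5, sigma=1, p0=0.5, p1=0.5))
```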

Evaluation metrics: precision and recall. Accuracy alone can be misleading when classes are imbalanced, so tabulate the confusion matrix: TP (true positives), FP (false positives), TN (true negatives), FN (false negatives).

Accuracy = (TP + TN) / (TP + FP + TN + FN)
Precision = TP / (TP + FP)
Recall = Sensitivity = TPR = TP / (TP + FN)
F-score: F1 = 2 · Precision · Recall / (Precision + Recall)

High precision means few false positives; high recall means few false negatives. The confusion matrix is the right diagnostic when one class dominates and a single accuracy number would flatter the classifier.
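The metric definitions in code, with made-up counts for illustration (the specific counts are not from the notes):

```python
def metrics(tp, fp, tn, fn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, f1, accuracy

# e.g. 80 true positives, 10 false positives, 10 false negatives, 0 true negatives
print(metrics(tp=80, fp=10, tn=0, fn=10))  # precision = recall = f1 = 8/9 ~ 0.889
```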

NAME: ________ Exam 2, Page 3. Problem 1 (20 points). You want to classify zoo animals. Your zoo only has two species: elephants and giraffes. There are more elephants than giraffes: if Y is the species,

p_Y(elephant) = e / (e + 1), p_Y(giraffe) = 1 / (e + 1),

where e = 2.718... is the base of the natural logarithm. The height of giraffes is Gaussian, with mean µ_G = 5 meters and variance σ_G² = 1. The height of elephants is also Gaussian, with mean µ_E = 3 and variance σ_E² = 1. Under these circumstances, the minimum probability of error classifier is

ŷ(x) = giraffe if x > θ, elephant if x < θ.

Find the value of θ that minimizes the probability of error.

Solution: let Y = 0 denote elephant and Y = 1 denote giraffe, so x | Y = 0 ~ N(3, 1) and x | Y = 1 ~ N(5, 1), with priors P0 = e/(e + 1) and P1 = 1/(e + 1).

Decide giraffe when the likelihood ratio beats the prior ratio:

L(x) = p(x | Y = 1) / p(x | Y = 0) > P0 / P1 = e.

Taking logs:

ln L(x) = −(x − 5)²/2 + (x − 3)²/2 = 2x − 8 > ln e = 1,

so the boundary satisfies 2θ − 8 = 1, giving θ = 4.5.

The resulting probability of error is

Pe = P0 · Q((4.5 − 3)/1) + P1 · Q((5 − 4.5)/1) = [e · Q(1.5) + Q(0.5)] / (e + 1).
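A quick numeric check of this solution (helper names illustrative):

```python
from math import erfc, log, sqrt, e

def Q(z):
    return 0.5 * erfc(z / sqrt(2))

p_elephant, p_giraffe = e / (e + 1), 1 / (e + 1)
# theta = sigma^2 * ln(P_elephant/P_giraffe) / (mu_G - mu_E) + (mu_E + mu_G)/2
theta = 1.0 * log(p_elephant / p_giraffe) / (5 - 3) + (3 + 5) / 2
print(theta)  # 4.5
pe = p_elephant * Q(theta - 3) + p_giraffe * Q(5 - theta)
print(pe)     # ~ 0.132
```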