Bayesian Classifier

1 Bayesian Classifier

f : X → V, V a finite set of values. Instances x ∈ X can be described as a collection of features, x = (x_1, x_2, ..., x_n), x_i ∈ {0,1}. Given an example, assign it the most probable value in V. By Bayes rule:

v_MAP = argmax_{v_j ∈ V} P(v_j | x)
      = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j) / P(x_1, x_2, ..., x_n)
      = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j)

Notational convention: P(y) means P(Y = y).

Bayesian Learning, CS446 Spring '17

2 Bayesian Classifier

v_MAP = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j)

Given training data we can estimate the two terms. Estimating P(v_j) is easy: e.g., under the binomial distribution assumption, count the number of times v_j appears in the training data. However, it is not feasible to estimate P(x_1, x_2, ..., x_n | v_j): we would have to estimate, for each target value, the probability of each instance (most of which will never occur). In order to use Bayesian classifiers in practice, we need to make assumptions that will allow us to estimate these quantities.

3 Naive Bayes

v_MAP = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j)

By the chain rule:

P(x_1, x_2, ..., x_n | v_j)
  = P(x_1 | x_2, ..., x_n, v_j) P(x_2, ..., x_n | v_j)
  = P(x_1 | x_2, ..., x_n, v_j) P(x_2 | x_3, ..., x_n, v_j) P(x_3, ..., x_n | v_j)
  = ...
  = P(x_1 | x_2, ..., x_n, v_j) P(x_2 | x_3, ..., x_n, v_j) ... P(x_n | v_j)

Assumption: feature values are independent given the target value. Then:

P(x_1, x_2, ..., x_n | v_j) = ∏_{i=1}^n P(x_i | v_j)

4 Naive Bayes (2)

v_MAP = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j)

Assumption: feature values are independent given the target value:

P(x_1 = b_1, x_2 = b_2, ..., x_n = b_n | v = v_j) = ∏_{i=1}^n P(x_i = b_i | v = v_j)

Generative model: first choose a value v_j ∈ V according to P(v_j); then choose x_1, x_2, ..., x_n according to P(x_k | v_j).

5 Naive Bayes (3)

v_MAP = argmax_{v_j ∈ V} P(x_1, x_2, ..., x_n | v_j) P(v_j)

Assumption: feature values are independent given the target value:

P(x_1 = b_1, x_2 = b_2, ..., x_n = b_n | v = v_j) = ∏_{i=1}^n P(x_i = b_i | v = v_j)

Learning method: estimate n|V| + |V| parameters and use them to make a prediction. (How do we estimate them?)

Notice that this is learning without search. Given a collection of training examples, you just compute the best hypothesis (given the assumptions). This is learning without trying to achieve consistency, or even approximate consistency. Why does it work?

6 Conditional Independence

Notice that the feature values are conditionally independent given the target value, and are not required to be independent.

Example: the Boolean features are x and y. We define the label to be l = f(x,y) = x ∧ y over the product distribution p(x=0) = p(x=1) = 1/2 and p(y=0) = p(y=1) = 1/2. The distribution is defined so that x and y are independent: p(x,y) = p(x) p(y). That is:

        X=0          X=1
Y=0   1/4 (l=0)    1/4 (l=0)
Y=1   1/4 (l=0)    1/4 (l=1)

But, given that l=0: p(x=1 | l=0) = p(y=1 | l=0) = 1/3, while p(x=1, y=1 | l=0) = 0, so x and y are not conditionally independent.

7 Conditional Independence

The other direction also does not hold: x and y can be conditionally independent but not independent. Example: we define a distribution such that:

l=0: p(x=1 | l=0) = 1, p(y=1 | l=0) = 0
l=1: p(x=1 | l=1) = 0, p(y=1 | l=1) = 1

and assume that p(l=0) = p(l=1) = 1/2.

        X=0          X=1
Y=0    0  (l=0)    1/2 (l=0)
Y=1   1/2 (l=1)     0  (l=1)

Given the value of l, x and y are independent (check!). What about unconditional independence?

p(x=1) = p(x=1 | l=0) p(l=0) + p(x=1 | l=1) p(l=1) = 0.5 + 0 = 0.5
p(y=1) = p(y=1 | l=0) p(l=0) + p(y=1 | l=1) p(l=1) = 0 + 0.5 = 0.5

But:

p(x=1, y=1) = p(x=1, y=1 | l=0) p(l=0) + p(x=1, y=1 | l=1) p(l=1) = 0 ≠ 0.25

so x and y are not independent.
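Both checks are mechanical enough to verify in code. A minimal sketch: the joint table is the one from this slide, while the helper function p is ours, not from the lecture.

```python
# Joint distribution p(x, y, l) from the slide: x and y are
# conditionally independent given l, but not independent.
joint = {
    (1, 0, 0): 0.5,  # X=1, Y=0, l=0
    (0, 1, 1): 0.5,  # X=0, Y=1, l=1
}

def p(**fixed):
    """Marginal probability of the assignment in `fixed`
    over variables named x, y, l."""
    total = 0.0
    for (x, y, l), pr in joint.items():
        vals = {"x": x, "y": y, "l": l}
        if all(vals[k] == v for k, v in fixed.items()):
            total += pr
    return total

# Unconditional: p(x=1, y=1) != p(x=1) p(y=1)  -> not independent.
print(p(x=1), p(y=1), p(x=1, y=1))  # 0.5 0.5 0.0

# Conditional on l=0: p(x=1, y=0 | l=0) = p(x=1 | l=0) p(y=0 | l=0).
px_given = p(x=1, l=0) / p(l=0)
py_given = p(y=0, l=0) / p(l=0)
pxy_given = p(x=1, y=0, l=0) / p(l=0)
print(pxy_given == px_given * py_given)  # True
```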

8 Naïve Bayes Example

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
1    Sunny     Hot          High      Weak    No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Weak    Yes
4    Rain      Mild         High      Weak    Yes
5    Rain      Cool         Normal    Weak    Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         High      Weak    No
9    Sunny     Cool         Normal    Weak    Yes
10   Rain      Mild         Normal    Weak    Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Weak    Yes
14   Rain      Mild         High      Strong  No

9 Estimating Probabilities

v_NB = argmax_{v ∈ {yes, no}} P(v) ∏_i P(observation_i | v)

How do we estimate P(observation | v)?

10 Example

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Compute P(PlayTennis = yes) and P(PlayTennis = no).
Compute P(outlook = sunny/overcast/rain | PlayTennis = yes/no) (6 numbers).
Compute P(temp = hot/mild/cool | PlayTennis = yes/no) (6 numbers).
Compute P(humidity = high/normal | PlayTennis = yes/no) (4 numbers).
Compute P(wind = weak/strong | PlayTennis = yes/no) (4 numbers).

11 Example

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Compute P(PlayTennis = yes) and P(PlayTennis = no).
Compute P(outlook = sunny/overcast/rain | PlayTennis = yes/no) (6 numbers).
Compute P(temp = hot/mild/cool | PlayTennis = yes/no) (6 numbers).
Compute P(humidity = high/normal | PlayTennis = yes/no) (4 numbers).
Compute P(wind = weak/strong | PlayTennis = yes/no) (4 numbers).

Given a new instance: (Outlook = sunny; Temperature = cool; Humidity = high; Wind = strong). Predict: PlayTennis = ?

12 Example

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Given: (Outlook = sunny; Temperature = cool; Humidity = high; Wind = strong)

P(PlayTennis = yes) = 9/14 = 0.64      P(PlayTennis = no) = 5/14 = 0.36
P(outlook = sunny | yes) = 2/9         P(outlook = sunny | no) = 3/5
P(temp = cool | yes) = 3/9             P(temp = cool | no) = 1/5
P(humidity = high | yes) = 3/9         P(humidity = high | no) = 4/5
P(wind = strong | yes) = 3/9           P(wind = strong | no) = 3/5

P(yes, ...) ≈ 0.0053      P(no, ...) ≈ 0.0206

13 Example

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Given: (Outlook = sunny; Temperature = cool; Humidity = high; Wind = strong)

P(PlayTennis = yes) = 9/14 = 0.64      P(PlayTennis = no) = 5/14 = 0.36
P(outlook = sunny | yes) = 2/9         P(outlook = sunny | no) = 3/5
P(temp = cool | yes) = 3/9             P(temp = cool | no) = 1/5
P(humidity = high | yes) = 3/9         P(humidity = high | no) = 4/5
P(wind = strong | yes) = 3/9           P(wind = strong | no) = 3/5

P(yes, ...) ≈ 0.0053      P(no, ...) ≈ 0.0206

P(no | instance) = 0.0206 / (0.0053 + 0.0206) = 0.795

What if we were asked about Outlook = Overcast?
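The whole computation can be reproduced directly from the table on slide 8. A small sketch; the helper function names are ours:

```python
# Naive Bayes on the PlayTennis data from slide 8 (no smoothing).
data = [
    ("Sunny","Hot","High","Weak","No"),        ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"),    ("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"),     ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"),("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"),    ("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"),  ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Hot","Normal","Weak","Yes"),  ("Rain","Mild","High","Strong","No"),
]

def prior(label):
    """P(PlayTennis = label), estimated by counting."""
    return sum(1 for row in data if row[-1] == label) / len(data)

def cond(i, value, label):
    """Maximum-likelihood estimate of P(x_i = value | label)."""
    n = sum(1 for row in data if row[-1] == label)
    n_k = sum(1 for row in data if row[-1] == label and row[i] == value)
    return n_k / n

x = ("Sunny", "Cool", "High", "Strong")
score = {}
for label in ("Yes", "No"):
    s = prior(label)
    for i, value in enumerate(x):
        s *= cond(i, value, label)
    score[label] = s

p_no = score["No"] / (score["Yes"] + score["No"])
print(round(score["Yes"], 4), round(score["No"], 4), round(p_no, 3))
```

Running this reproduces the numbers on the slide, including P(no | instance) = 0.795.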

14 Estimating Probabilities

v_NB = argmax_{v ∈ {like, dislike}} P(v) ∏_k P(word_k | v)

How do we estimate P(word_k | v)? As we suggested before, we made a binomial assumption; then:

P(word_k | v) = #(word_k appears in training documents labeled v) / #(documents labeled v)

Sparsity of data is a problem: if n is small, the estimate is not accurate; if n_k is 0, it will dominate the estimate: we will never predict v if a word that never appeared in training (with v) appears in the test data.

15 Robust Estimation of Probabilities

v_NB = argmax_{v ∈ {like, dislike}} P(v) ∏_k P(word_k | v)

This process is called smoothing. There are many ways to do it, some better justified than others; it is an empirical issue. Here:

P(x_k | v) = (n_k + m·p) / (n + m)

where n_k is the # of occurrences of the word in the presence of v; n is the # of occurrences of the label v; p is a prior estimate of P(x_k | v) (e.g., uniform); m is the equivalent sample size (# of labels). Is this a reasonable definition?

16 Robust Estimation of Probabilities

Smoothing:

P(x_k | v) = (n_k + m·p) / (n + m)

Common values. Laplace rule: for the Boolean case, p = 1/2 and m = 2:

P(x_k | v) = (n_k + 1) / (n + 2)

Learning to classify text: p = 1 / (# of values) (uniform), m = # of values.
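As a sketch, the m-estimate above is a one-liner; the default arguments below encode the Laplace rule:

```python
def smoothed(n_k, n, p=0.5, m=2):
    """m-estimate of probability: (n_k + m*p) / (n + m).
    With p = 1/2, m = 2 this is the Laplace rule (n_k + 1)/(n + 2)."""
    return (n_k + m * p) / (n + m)

# A word never seen with this label no longer zeroes out the product:
print(smoothed(0, 9))   # 1/11, not 0
# Laplace rule for a Boolean feature seen 3 times out of 9:
print(smoothed(3, 9))   # (3 + 1) / (9 + 2) = 4/11
```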

17 Robust Estimation

Assume a binomial r.v.: p(k | n, µ) = C(n, k) µ^k (1 - µ)^{n-k}. We saw that the maximum likelihood estimate is µ_ML = k/n. In order to compute the MAP estimate, we need to assume a prior. It is easiest to assume a prior of the form:

p(µ) = µ^{a-1} (1 - µ)^{b-1}    (a and b are called the hyperparameters)

The prior in this case is the Beta distribution, and it is called a conjugate prior, since it has the same form as the posterior. Indeed, it is easy to compute the posterior:

p(µ | D) ∝ p(D | µ) p(µ) = µ^{a+k-1} (1 - µ)^{b+n-k-1}

Therefore, as we have shown before (differentiate the log posterior):

µ_MAP = (k + a - 1) / (n + a + b - 2)

The posterior mean: E(µ | D) = ∫_0^1 µ p(µ | D) dµ = (a + k) / (a + b + n). Under the uniform prior, the posterior mean after observing (k, n) is (k + 1) / (n + 2).
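The closed forms above are easy to sanity-check numerically: maximize the unnormalized log posterior on a grid and compare. A sketch with toy values for k, n, a, b:

```python
import math

def map_estimate(k, n, a, b):
    """mu_MAP = (k + a - 1) / (n + a + b - 2) for a Beta(a, b) prior."""
    return (k + a - 1) / (n + a + b - 2)

def posterior_mean(k, n, a, b):
    """E[mu | D] = (a + k) / (a + b + n)."""
    return (a + k) / (a + b + n)

def log_posterior(mu, k, n, a, b):
    """Unnormalized log posterior: mu^(a+k-1) (1-mu)^(b+n-k-1)."""
    return (a + k - 1) * math.log(mu) + (b + n - k - 1) * math.log(1 - mu)

k, n, a, b = 3, 10, 2, 2
grid = [i / 100000 for i in range(1, 100000)]
best = max(grid, key=lambda mu: log_posterior(mu, k, n, a, b))

print(map_estimate(k, n, a, b))     # closed form: (3+1)/(10+2) = 1/3
print(best)                         # grid maximum agrees to ~1e-5
print(posterior_mean(3, 10, 1, 1))  # uniform prior: (3+1)/(10+2)
```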

18 Naïve Bayes: Two Classes

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Notice that the naïve Bayes method gives a method for predicting rather than an explicit classifier. In the case of two classes, v ∈ {0, 1}, we predict that v = 1 iff:

P(v=1) ∏_{i=1}^n P(x_i | v=1) / [ P(v=0) ∏_{i=1}^n P(x_i | v=0) ] > 1

19 Naïve Bayes: Two Classes

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

In the case of two classes, v ∈ {0, 1}, we predict that v = 1 iff:

P(v=1) ∏_{i=1}^n P(x_i | v=1) / [ P(v=0) ∏_{i=1}^n P(x_i | v=0) ] > 1

Denote p_i = P(x_i = 1 | v = 1) and q_i = P(x_i = 1 | v = 0). Then we predict v = 1 iff:

P(v=1) ∏_{i=1}^n p_i^{x_i} (1 - p_i)^{1-x_i} / [ P(v=0) ∏_{i=1}^n q_i^{x_i} (1 - q_i)^{1-x_i} ] > 1

20 Naïve Bayes: Two Classes

In the case of two classes, v ∈ {0, 1}, we predict that v = 1 iff:

P(v=1) ∏_i p_i^{x_i} (1 - p_i)^{1-x_i} / [ P(v=0) ∏_i q_i^{x_i} (1 - q_i)^{1-x_i} ]
  = P(v=1) ∏_i (1 - p_i) ∏_i (p_i / (1 - p_i))^{x_i} / [ P(v=0) ∏_i (1 - q_i) ∏_i (q_i / (1 - q_i))^{x_i} ] > 1

21 Naïve Bayes: Two Classes

In the case of two classes, v ∈ {0, 1}, we predict that v = 1 iff:

P(v=1) ∏_i (1 - p_i) ∏_i (p_i / (1 - p_i))^{x_i} / [ P(v=0) ∏_i (1 - q_i) ∏_i (q_i / (1 - q_i))^{x_i} ] > 1

Taking the logarithm, we predict v = 1 iff:

log P(v=1)/P(v=0) + Σ_i log (1 - p_i)/(1 - q_i) + Σ_i ( log p_i/(1 - p_i) - log q_i/(1 - q_i) ) x_i > 0

22 Naïve Bayes: Two Classes

Taking the logarithm, we predict v = 1 iff:

log P(v=1)/P(v=0) + Σ_i log (1 - p_i)/(1 - q_i) + Σ_i ( log p_i/(1 - p_i) - log q_i/(1 - q_i) ) x_i > 0

We get that naïve Bayes is a linear separator, with weights:

w_i = log p_i/(1 - p_i) - log q_i/(1 - q_i) = log [ p_i (1 - q_i) / (q_i (1 - p_i)) ]

If p_i = q_i then w_i = 0 and the feature is irrelevant.
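The claim is easy to verify: compute the weights from (p_i, q_i) and check that the linear score equals the NB log-odds on any input. A sketch with made-up parameter values (p, q, and the priors are for illustration only):

```python
import math

# Naive Bayes over Boolean features as a linear separator:
#   log-odds(x) = sum_i w_i x_i + theta, with
#   w_i   = log(p_i (1 - q_i) / (q_i (1 - p_i)))
#   theta = log(P(v=1)/P(v=0)) + sum_i log((1 - p_i)/(1 - q_i))
# where p_i = P(x_i=1 | v=1), q_i = P(x_i=1 | v=0).
p = [0.8, 0.5, 0.1]
q = [0.3, 0.5, 0.6]
prior1, prior0 = 0.4, 0.6

w = [math.log(pi * (1 - qi) / (qi * (1 - pi))) for pi, qi in zip(p, q)]
theta = math.log(prior1 / prior0) + sum(
    math.log((1 - pi) / (1 - qi)) for pi, qi in zip(p, q))

def log_odds_direct(x):
    """log P(v=1, x) - log P(v=0, x), straight from the NB model."""
    num = math.log(prior1) + sum(
        math.log(pi if xi else 1 - pi) for pi, xi in zip(p, x))
    den = math.log(prior0) + sum(
        math.log(qi if xi else 1 - qi) for qi, xi in zip(q, x))
    return num - den

x = (1, 0, 1)
linear = sum(wi * xi for wi, xi in zip(w, x)) + theta
print(abs(linear - log_odds_direct(x)) < 1e-9)  # True: the two forms agree
print(w[1])  # p_1 == q_1, so this feature's weight is 0.0
```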

23 Naïve Bayes: Two Classes

In the case of two classes we have that:

log [ P(v=1 | x) / P(v=0 | x) ] = Σ_i w_i x_i - b

but since P(v=1 | x) = 1 - P(v=0 | x), we get:

P(v=1 | x) = 1 / (1 + exp(-(Σ_i w_i x_i - b)))

which is simply the logistic function. The linearity of NB provides a better explanation for why it works.

Derivation: let A = P(v=1 | x), B = P(v=0 | x) = 1 - A, and log(B/A) = -C where C = Σ_i w_i x_i - b. Then exp(-C) = B/A = (1 - A)/A = 1/A - 1, so 1 + exp(-C) = 1/A, and A = 1/(1 + exp(-C)).

24 A few more NB examples

25 Example: Learning to Classify Text

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(x_i | v_j)

Instance space X: text documents. Instances are labeled according to f(x) = like/dislike. Goal: learn this function such that, given a new document, you can use it to decide whether you like it or not. How do we represent the document? How do we estimate the probabilities? How do we classify?

26 Document Representation

Instance space X: text documents. Instances are labeled according to y = f(x) = like/dislike. How do we represent a document?

A document will be represented as a list of its words. The representation question can be viewed as the generation question. We have a dictionary of n words (therefore 2n parameters). We have documents of size N: we could account for word position & count, but having a parameter for each word & position may be too much: # of parameters: 2 × N × n (2 × 100 × 50,000 ≈ 10^7). Simplifying assumption: the probability of observing a word in a document is independent of its location. This still allows us to think about two ways of generating the document.

27 Classification via Bayes Rule (B)

We want to compute argmax_y P(y | D) = argmax_y P(D | y) P(y) / P(D) = argmax_y P(D | y) P(y). Our assumptions will go into estimating P(D | y):

1. Multivariate Bernoulli
I. To generate a document, first decide if it is good (y=1) or bad (y=0).
II. Given that, consider your dictionary of words and choose each word w into your document with probability p(w | y), irrespective of anything else.
III. If the size of the dictionary is |V| = n, we can then write:

P(d | y) = ∏_{i=1}^n P(w_i = 1 | y)^{b_i} P(w_i = 0 | y)^{1-b_i}

where p(w = 1/0 | y) is the probability that w does/does not appear in a y-labeled document, and b_i ∈ {0,1} indicates whether word w_i occurs in document d.

Parameters: (1) priors P(y = 0/1); (2) for each w_i in the dictionary, p(w_i = 0/1 | y = 0/1). In total, 2n + 2 parameters. Estimating P(w_i = 1 | y) and P(y) is done in the ML way as before (counting).

28 A Multinomial Model

We want to compute argmax_y P(y | D) = argmax_y P(D | y) P(y) / P(D) = argmax_y P(D | y) P(y). Our assumptions will go into estimating P(D | y):

2. Multinomial
I. To generate a document, first decide if it is good (y=1) or bad (y=0).
II. Given that, place N words into d, such that w_i is placed with probability P(w_i | y), and Σ_i P(w_i | y) = 1.
III. The probability of a document is then:

P(d | y) = N! / (n_1! ... n_k!) · P(w_1 | y)^{n_1} ... P(w_k | y)^{n_k}

where n_i is the # of times w_i appears in the document.

Parameters: (1) priors P(y = 0/1); (2) dictionary probabilities p(w_i | y = 0/1). Same # of parameters, 2n + 2, where n = |Dictionary|, but the estimation is done a bit differently. (HW.)
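In log space the multinomial likelihood above is a short function (using lgamma(n + 1) = log n!). A sketch with an invented three-word vocabulary and made-up parameters:

```python
import math
from math import lgamma

# Multinomial document likelihood, as in the model above:
#   P(d | y) = N!/(n_1! ... n_k!) * prod_i P(w_i | y)^{n_i}
def log_p_doc(counts, word_probs):
    """log P(d | y) for word counts {word: n_i} under P(w_i | y)."""
    n = sum(counts.values())
    log_coef = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts.values())
    return log_coef + sum(c * math.log(word_probs[w]) for w, c in counts.items())

word_probs = {"good": 0.5, "bad": 0.2, "movie": 0.3}  # toy P(w | y), sums to 1
counts = {"good": 2, "movie": 1}                      # a document of length 3

lp = log_p_doc(counts, word_probs)
print(math.exp(lp))  # 3!/(2! 1!) * 0.5^2 * 0.3 = 0.225
```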

29 Model Representation

The generative model in these two cases is different. (The slide shows two plate diagrams: in each, a prior µ governs the label of a document d; in the Bernoulli model the inner plate ranges over dictionary words w, and in the multinomial model over positions p in the document.)

Bernoulli: a binary variable corresponds to a document d and a dictionary word w, and takes the value 1 iff w appears in d. The document topic/label is governed by a prior µ, and the variable in the intersection of the plates is governed by µ and the Bernoulli parameter for the dictionary word w.

Multinomial: variables do not correspond to dictionary words but to positions (occurrences) in the document d. The internal variable is then W(D,P). These variables are generated from the same multinomial distribution, and depend on the topic/label.

30 General NB Scenario

We assume a mixture probability model, parameterized by µ. Different components {c_1, c_2, ..., c_k} of the model are parameterized by disjoint subsets of µ.

The generative story: a document d is created by (1) selecting a component according to the priors, P(c_j | µ), then (2) having the mixture component generate a document according to its own parameters, with distribution P(d | c_j, µ). So we have:

P(d | µ) = Σ_{j=1}^k P(c_j | µ) P(d | c_j, µ)

In the case of document classification, we assume a one-to-one correspondence between components and labels.

31 Naïve Bayes: Continuous Features

X_i can be continuous. We can still use

P(X_1, ..., X_n | Y) = ∏_i P(X_i | Y)

and

P(Y = y | X) ∝ P(Y = y) ∏_i P(X_i | Y = y)

32 Naïve Bayes: Continuous Features (2)

Naïve Bayes classifier:

Y = argmax_y P(Y = y) ∏_i P(X_i | Y = y)

33 Naïve Bayes: Continuous Features (3)

Assumption: P(X_i | Y) has a Gaussian distribution.

34 The Gaussian Probability Distribution

The Gaussian probability distribution, also called the normal distribution, is a continuous distribution with pdf:

p(x) = 1 / (σ √(2π)) · e^{-(x - µ)² / (2σ²)}

where µ is the mean of the distribution and σ² its variance; x is a continuous variable (-∞ ≤ x ≤ ∞). The probability of x lying in a range [a, b] cannot be evaluated analytically (it has to be looked up in a table).

35 Naïve Bayes: Continuous Features

P(X_i | Y) is Gaussian. Training: estimate the mean and standard deviation:

µ_i = E[X_i | Y = y]
σ_i² = E[(X_i - µ_i)² | Y = y]

Note that the following slides abuse notation significantly. Since P(x) = 0 for continuous distributions, we think of P(X = x | Y = y) not as a classic probability distribution, but just as a function f(x) = N(x; µ, σ²). f(x) behaves as a probability distribution in the sense that ∀x, f(x) ≥ 0 and the values integrate to 1. Also, note that f(x) satisfies Bayes rule; that is:

f_Y(y | X = x) = f_X(x | Y = y) f_Y(y) / f_X(x)

36-37 Naïve Bayes: Continuous Features

P(X_i | Y) is Gaussian. Training: estimate the mean and standard deviation:

µ_i = E[X_i | Y = y]
σ_i² = E[(X_i - µ_i)² | Y = y]

(These two slides worked through a small training table with features X_1, X_2, X_3 and label Y; the numeric entries did not survive transcription.)
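Training and prediction for Gaussian naïve Bayes can be sketched in a few lines; the tiny dataset below is invented for illustration (it is not the slide's table):

```python
import math

# Gaussian naive Bayes: per class y and feature i, estimate the mean
# and standard deviation, then score new points with the normal pdf.
data = [
    ((1.0, 2.1), 0), ((1.2, 1.9), 0), ((0.8, 2.0), 0),
    ((3.0, 0.9), 1), ((3.2, 1.1), 1), ((2.8, 1.0), 1),
]

def fit(data):
    """Return {label: (prior, per-feature means, per-feature stds)}."""
    params = {}
    for y in {label for _, label in data}:
        rows = [x for x, label in data if label == y]
        n = len(rows)
        mus = [sum(col) / n for col in zip(*rows)]
        sigmas = [math.sqrt(sum((v - mu) ** 2 for v in col) / n)
                  for col, mu in zip(zip(*rows), mus)]
        params[y] = (n / len(data), mus, sigmas)
    return params

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def predict(params, x):
    def score(y):
        prior, mus, sigmas = params[y]
        s = math.log(prior)
        for xi, mu, sigma in zip(x, mus, sigmas):
            s += math.log(normal_pdf(xi, mu, sigma))
        return s
    return max(params, key=score)

params = fit(data)
print(predict(params, (1.1, 2.0)))  # class 0
print(predict(params, (3.1, 1.0)))  # class 1
```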

38 Recall: Naïve Bayes, Two Classes

In the case of two classes we have that:

log [ P(v=1 | x) / P(v=0 | x) ] = Σ_i w_i x_i - b

but since P(v=1 | x) = 1 - P(v=0 | x), we get:

P(v=1 | x) = 1 / (1 + exp(-(Σ_i w_i x_i - b)))

which is simply the logistic function (also used in the neural network representation). The same formula can be written for continuous features.

39 Logistic Function: Continuous Features

The logistic function for Gaussian features. Note that we are using a ratio of probabilities (densities), since x is a continuous variable.

40 Hidden Markov Model (HMM)

A probabilistic generative model: it models the generation of an observed sequence. At each time step there are two variables: the current state (hidden) and an observation.

s_1 → s_2 → s_3 → s_4 → s_5 → s_6
 |     |     |     |     |     |
o_1   o_2   o_3   o_4   o_5   o_6

Elements:
Initial state probability P(s_1) (|S| parameters)
Transition probability P(s_t | s_{t-1}) (|S|² parameters)
Observation probability P(o_t | s_t) (|S| × |O| parameters)

As before, the graphical model is an encoding of the independence assumptions:

P(s_t | s_{t-1}, s_{t-2}, ..., s_1) = P(s_t | s_{t-1})
P(o_t | s_T, ..., s_t, ..., s_1, o_T, ..., o_t, ..., o_1) = P(o_t | s_t)

Examples: POS tagging, sequential segmentation.

41 HMM for Shallow Parsing

States: {B, I, O}. Observations: actual words and/or part-of-speech tags.

s_1=B    s_2=I      s_3=O       s_4=B    s_5=I    s_6=O
o_1=Mr.  o_2=Brown  o_3=blamed  o_4=Mr.  o_5=Bob  o_6=for

42 HMM for Shallow Parsing

s_1=B    s_2=I      s_3=O       s_4=B    s_5=I    s_6=O
o_1=Mr.  o_2=Brown  o_3=blamed  o_4=Mr.  o_5=Bob  o_6=for

Initial state probability: P(s_1=B), P(s_1=I), P(s_1=O)
Transition probability: P(s_t=B | s_{t-1}=B), P(s_t=I | s_{t-1}=B), P(s_t=O | s_{t-1}=B), P(s_t=B | s_{t-1}=I), ...
Observation probability: P(o_t=Mr. | s_t=B), P(o_t=Brown | s_t=B), ..., P(o_t=Mr. | s_t=I), P(o_t=Brown | s_t=I), ...

Given a sentence, we can ask what the most likely state sequence is.

43 Three Computational Problems

Decoding: finding the most likely path. Have: model, parameters, observations (data). Want: the most likely state sequence:

S*_1 S*_2 ... S*_T = argmax_{S_1 S_2 ... S_T} p(S_1 S_2 ... S_T | O_1 O_2 ... O_T) = argmax_{S_1 S_2 ... S_T} p(S_1 S_2 ... S_T, O_1 O_2 ... O_T)

Evaluation: computing the observation likelihood. Have: model, parameters, observations (data). Want: the likelihood of generating the observed data:

p(O_1 O_2 ... O_T) = Σ_{S_1 S_2 ... S_T} p(O_1 O_2 ... O_T | S_1 S_2 ... S_T) p(S_1 S_2 ... S_T)

In both cases, a simple-minded solution takes |S|^T steps.

Training: estimating the parameters. Supervised: have model and annotated data (data + state sequences). Unsupervised: have model and data. Want: parameters.

44 Finding the Most Likely State Sequence in an HMM (1)

P(s_k, s_{k-1}, ..., s_1, o_k, o_{k-1}, ..., o_1)
= P(o_k | o_{k-1}, ..., o_1, s_k, s_{k-1}, ..., s_1) P(o_{k-1}, ..., o_1, s_k, s_{k-1}, ..., s_1)
= P(o_k | s_k) P(o_{k-1}, ..., o_1, s_k, s_{k-1}, ..., s_1)
= P(o_k | s_k) P(s_k | s_{k-1}, ..., s_1, o_{k-1}, ..., o_1) P(s_{k-1}, ..., s_1, o_{k-1}, ..., o_1)
= P(o_k | s_k) P(s_k | s_{k-1}) P(s_{k-1}, ..., s_1, o_{k-1}, ..., o_1)
= P(o_k | s_k) [ ∏_{t=1}^{k-1} P(s_{t+1} | s_t) P(o_t | s_t) ] P(s_1)

45 Finding the Most Likely State Sequence in an HMM (2)

argmax_{s_k, s_{k-1}, ..., s_1} P(s_k, s_{k-1}, ..., s_1 | o_k, o_{k-1}, ..., o_1)
= argmax_{s_k, ..., s_1} P(s_k, s_{k-1}, ..., s_1, o_k, o_{k-1}, ..., o_1) / P(o_k, o_{k-1}, ..., o_1)
= argmax_{s_k, ..., s_1} P(s_k, s_{k-1}, ..., s_1, o_k, o_{k-1}, ..., o_1)
= argmax_{s_k, ..., s_1} P(o_k | s_k) [ ∏_{t=1}^{k-1} P(s_{t+1} | s_t) P(o_t | s_t) ] P(s_1)

46 Finding the Most Likely State Sequence in an HMM (3)

max_{s_k, ..., s_1} P(o_k | s_k) [ ∏_{t=1}^{k-1} P(s_{t+1} | s_t) P(o_t | s_t) ] P(s_1)
= max_{s_k} P(o_k | s_k) max_{s_{k-1}, ..., s_1} [ ∏_{t=1}^{k-1} P(s_{t+1} | s_t) P(o_t | s_t) ] P(s_1)
= max_{s_k} P(o_k | s_k) max_{s_{k-1}} [ P(s_k | s_{k-1}) P(o_{k-1} | s_{k-1}) ] max_{s_{k-2}, ..., s_1} [ ∏_{t=1}^{k-2} P(s_{t+1} | s_t) P(o_t | s_t) ] P(s_1)
= max_{s_k} P(o_k | s_k) max_{s_{k-1}} [ P(s_k | s_{k-1}) P(o_{k-1} | s_{k-1}) ] max_{s_{k-2}} [ P(s_{k-1} | s_{k-2}) P(o_{k-2} | s_{k-2}) ] ... max_{s_1} [ P(s_2 | s_1) P(o_1 | s_1) ] P(s_1)

Each inner max is a function of the state variable immediately outside it.

47 Finding the Most Likely State Sequence in an HMM (4)

max_{s_k} P(o_k | s_k) max_{s_{k-1}} [ P(s_k | s_{k-1}) P(o_{k-1} | s_{k-1}) ] max_{s_{k-2}} [ P(s_{k-1} | s_{k-2}) P(o_{k-2} | s_{k-2}) ] ... max_{s_2} [ P(s_3 | s_2) P(o_2 | s_2) ] max_{s_1} [ P(s_2 | s_1) P(o_1 | s_1) ] P(s_1)

Viterbi's algorithm: dynamic programming.

48 Learning the Model

Estimate: the initial state probability P(s_1), the transition probability P(s_t | s_{t-1}), and the observation probability P(o_t | s_t).

Unsupervised learning (states are not observed): EM algorithm.
Supervised learning (states are observed; more common): ML estimates of the above terms directly from the data.

Notice that this is completely analogous to the case of naïve Bayes, and essentially all other models.


More information

Part I: Background on the Binomial Distribution

Part I: Background on the Binomial Distribution Part I: Bacgroud o the Bomal Dstrbuto A radom varable s sad to have a Beroull dstrbuto f t taes o the value wth probablt "p" ad the value wth probablt " - p". The umber of "successes" "" depedet Beroull

More information

MA/CSSE 473 Day 27. Dynamic programming

MA/CSSE 473 Day 27. Dynamic programming MA/CSSE 473 Day 7 Dyamc Programmg Bomal Coeffcets Warshall's algorthm (Optmal BSTs) Studet questos? Dyamc programmg Used for problems wth recursve solutos ad overlappg subproblems Typcally, we save (memoze)

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

TESTS BASED ON MAXIMUM LIKELIHOOD

TESTS BASED ON MAXIMUM LIKELIHOOD ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal

More information

Parameter Estimation

Parameter Estimation arameter Estmato robabltes Notatoal Coveto Mass dscrete fucto: catal letters Desty cotuous fucto: small letters Vector vs. scalar Scalar: la Vector: bold D: small Hgher dmeso: catal Notes a cotuous state

More information

Continuous Distributions

Continuous Distributions 7//3 Cotuous Dstrbutos Radom Varables of the Cotuous Type Desty Curve Percet Desty fucto, f (x) A smooth curve that ft the dstrbuto 3 4 5 6 7 8 9 Test scores Desty Curve Percet Probablty Desty Fucto, f

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab Lear Regresso Lear Regresso th Shrkage Some sldes are due to Tomm Jaakkola, MIT AI Lab Itroducto The goal of regresso s to make quattatve real valued predctos o the bass of a vector of features or attrbutes.

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the

More information

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set. Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the

More information

THE ROYAL STATISTICAL SOCIETY 2010 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULE 2 STATISTICAL INFERENCE

THE ROYAL STATISTICAL SOCIETY 2010 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULE 2 STATISTICAL INFERENCE THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULE STATISTICAL INFERENCE The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the

More information

Special Instructions / Useful Data

Special Instructions / Useful Data JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth

More information

Bayesian belief networks

Bayesian belief networks Lecture 14 ayesa belef etworks los Hauskrecht mlos@cs.ptt.edu 5329 Seott Square Desty estmato Data: D { D1 D2.. D} D x a vector of attrbute values ttrbutes: modeled by radom varables { 1 2 d} wth: otuous

More information

(b) By independence, the probability that the string 1011 is received correctly is

(b) By independence, the probability that the string 1011 is received correctly is Soluto to Problem 1.31. (a) Let A be the evet that a 0 s trasmtted. Usg the total probablty theorem, the desred probablty s P(A)(1 ɛ ( 0)+ 1 P(A) ) (1 ɛ 1)=p(1 ɛ 0)+(1 p)(1 ɛ 1). (b) By depedece, the probablty

More information

L5 Polynomial / Spline Curves

L5 Polynomial / Spline Curves L5 Polyomal / Sple Curves Cotets Coc sectos Polyomal Curves Hermte Curves Bezer Curves B-Sples No-Uform Ratoal B-Sples (NURBS) Mapulato ad Represetato of Curves Types of Curve Equatos Implct: Descrbe a

More information

Chapter 8: Statistical Analysis of Simulated Data

Chapter 8: Statistical Analysis of Simulated Data Marquette Uversty MSCS600 Chapter 8: Statstcal Aalyss of Smulated Data Dael B. Rowe, Ph.D. Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 08 by Marquette Uversty MSCS600 Ageda 8. The Sample

More information

X ε ) = 0, or equivalently, lim

X ε ) = 0, or equivalently, lim Revew for the prevous lecture Cocepts: order statstcs Theorems: Dstrbutos of order statstcs Examples: How to get the dstrbuto of order statstcs Chapter 5 Propertes of a Radom Sample Secto 55 Covergece

More information

A tighter lower bound on the circuit size of the hardest Boolean functions

A tighter lower bound on the circuit size of the hardest Boolean functions Electroc Colloquum o Computatoal Complexty, Report No. 86 2011) A tghter lower boud o the crcut sze of the hardest Boolea fuctos Masak Yamamoto Abstract I [IPL2005], Fradse ad Mlterse mproved bouds o the

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades STAT 101 Dr. Kar Lock Morga 11/20/12 Exam 2 Grades Multple Regresso SECTIONS 9.2, 10.1, 10.2 Multple explaatory varables (10.1) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (10.2) Trasformatos

More information

CS 3710 Advanced Topics in AI Lecture 17. Density estimation. CS 3710 Probabilistic graphical models. Administration

CS 3710 Advanced Topics in AI Lecture 17. Density estimation. CS 3710 Probabilistic graphical models. Administration CS 37 Avace Topcs AI Lecture 7 esty estmato Mlos Hauskrecht mlos@cs.ptt.eu 539 Seott Square CS 37 robablstc graphcal moels Amstrato Mterm: A take-home exam week ue o Weesay ovember 5 before the class epes

More information

Objectives of Multiple Regression

Objectives of Multiple Regression Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of

More information

Dimensionality reduction Feature selection

Dimensionality reduction Feature selection CS 750 Mache Learg Lecture 3 Dmesoalty reducto Feature selecto Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 750 Mache Learg Dmesoalty reducto. Motvato. Classfcato problem eample: We have a put data

More information

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever.

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever. 9.4 Sequeces ad Seres Pre Calculus 9.4 SEQUENCES AND SERIES Learg Targets:. Wrte the terms of a explctly defed sequece.. Wrte the terms of a recursvely defed sequece. 3. Determe whether a sequece s arthmetc,

More information

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should

More information

8.1 Hashing Algorithms

8.1 Hashing Algorithms CS787: Advaced Algorthms Scrbe: Mayak Maheshwar, Chrs Hrchs Lecturer: Shuch Chawla Topc: Hashg ad NP-Completeess Date: September 21 2007 Prevously we looked at applcatos of radomzed algorthms, ad bega

More information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information Malaysa Joural of Mathematcal Sceces (): 97- (9) Bayes Estmator for Expoetal Dstrbuto wth Exteso of Jeffery Pror Iformato Hadeel Salm Al-Kutub ad Noor Akma Ibrahm Isttute for Mathematcal Research, Uverst

More information

Kernel-based Methods and Support Vector Machines

Kernel-based Methods and Support Vector Machines Kerel-based Methods ad Support Vector Maches Larr Holder CptS 570 Mache Learg School of Electrcal Egeerg ad Computer Scece Washgto State Uverst Refereces Muller et al. A Itroducto to Kerel-Based Learg

More information

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018 Chrs Pech Fal Practce CS09 Dec 5, 08 Practce Fal Examato Solutos. Aswer: 4/5 8/7. There are multle ways to obta ths aswer; here are two: The frst commo method s to sum over all ossbltes for the rak of

More information

KLT Tracker. Alignment. 1. Detect Harris corners in the first frame. 2. For each Harris corner compute motion between consecutive frames

KLT Tracker. Alignment. 1. Detect Harris corners in the first frame. 2. For each Harris corner compute motion between consecutive frames KLT Tracker Tracker. Detect Harrs corers the frst frame 2. For each Harrs corer compute moto betwee cosecutve frames (Algmet). 3. Lk moto vectors successve frames to get a track 4. Itroduce ew Harrs pots

More information

Introduction to Matrices and Matrix Approach to Simple Linear Regression

Introduction to Matrices and Matrix Approach to Simple Linear Regression Itroducto to Matrces ad Matrx Approach to Smple Lear Regresso Matrces Defto: A matrx s a rectagular array of umbers or symbolc elemets I may applcatos, the rows of a matrx wll represet dvduals cases (people,

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis)

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis) We have covered: Selecto, Iserto, Mergesort, Bubblesort, Heapsort Next: Selecto the Qucksort The Selecto Problem - Varable Sze Decrease/Coquer (Practce wth algorthm aalyss) Cosder the problem of fdg the

More information

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x CS 75 Mache Learg Lecture 8 Lear regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Lear regresso Fucto f : X Y s a lear combato of put compoets f + + + K d d K k - parameters

More information

Chapter 2 Supplemental Text Material

Chapter 2 Supplemental Text Material -. Models for the Data ad the t-test Chapter upplemetal Text Materal The model preseted the text, equato (-3) s more properl called a meas model. ce the mea s a locato parameter, ths tpe of model s also

More information

ENGI 3423 Simple Linear Regression Page 12-01

ENGI 3423 Simple Linear Regression Page 12-01 ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

The Mathematical Appendix

The Mathematical Appendix The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.

More information

Solutions to Odd-Numbered End-of-Chapter Exercises: Chapter 17

Solutions to Odd-Numbered End-of-Chapter Exercises: Chapter 17 Itroucto to Ecoometrcs (3 r Upate Eto) by James H. Stock a Mark W. Watso Solutos to O-Numbere E-of-Chapter Exercses: Chapter 7 (Ths erso August 7, 04) 05 Pearso Eucato, Ic. Stock/Watso - Itroucto to Ecoometrcs

More information

Nonparametric Density Estimation Intro

Nonparametric Density Estimation Intro Noarametrc Desty Estmato Itro Parze Wdows No-Parametrc Methods Nether robablty dstrbuto or dscrmat fucto s kow Haes qute ofte All we have s labeled data a lot s kow easer salmo bass salmo salmo Estmate

More information

SPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS

SPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS SPECIAL CONSIDERAIONS FOR VOLUMERIC Z-ES FOR PROPORIONS Oe s stctve reacto to the questo of whether two percetages are sgfcatly dfferet from each other s to treat them as f they were proportos whch the

More information

ESTIMATION OF MISCLASSIFICATION ERROR USING BAYESIAN CLASSIFIERS

ESTIMATION OF MISCLASSIFICATION ERROR USING BAYESIAN CLASSIFIERS Producto Systems ad Iformato Egeerg Volume 5 (2009), pp. 4-50. ESTIMATION OF MISCLASSIFICATION ERROR USING BAYESIAN CLASSIFIERS PÉTER BARABÁS Uversty of Msolc, Hugary Departmet of Iformato Techology barabas@t.u-msolc.hu

More information

CS 2750 Machine Learning Lecture 5. Density estimation. Density estimation

CS 2750 Machine Learning Lecture 5. Density estimation. Density estimation CS 750 Mache Learg Lecture 5 esty estmato Mlos Hausrecht mlos@tt.edu 539 Seott Square esty estmato esty estmato: s a usuervsed learg roblem Goal: Lear a model that rereset the relatos amog attrbutes the

More information

CHAPTER 3 POSTERIOR DISTRIBUTIONS

CHAPTER 3 POSTERIOR DISTRIBUTIONS CHAPTER 3 POSTERIOR DISTRIBUTIONS If scece caot measure the degree of probablt volved, so much the worse for scece. The practcal ma wll stck to hs apprecatve methods utl t does, or wll accept the results

More information

Machine Learning. Introduction to Regression. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012

Machine Learning. Introduction to Regression. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012 Mache Learg CSE6740/CS764/ISYE6740, Fall 0 Itroducto to Regresso Le Sog Lecture 4, August 30, 0 Based o sldes from Erc g, CMU Readg: Chap. 3, CB Mache learg for apartmet hutg Suppose ou are to move to

More information

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s

More information

Interpolated Markov Models for Gene Finding

Interpolated Markov Models for Gene Finding Iterpolated Markov Models for Gee Fdg BMI/CS 776 www.bostat.wsc.edu/bm776/ Sprg 2009 Mark Crave crave@bostat.wsc.edu The Gee Fdg Task Gve: a ucharacterzed DNA sequece Do: locate the gees the sequece, cludg

More information

Lecture 02: Bounding tail distributions of a random variable

Lecture 02: Bounding tail distributions of a random variable CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome

More information

An Introduction to. Support Vector Machine

An Introduction to. Support Vector Machine A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork

More information