Course Content. What is Classification? Chapter 4 Objectives


Principles of Knowledge Discovery in Data
Fall 2007
Chapter 4: Classification
Dr. Osmar R. Zaïane
University of Alberta

Course Content
- Introduction to Data Mining
- Association Analysis
- Sequential Pattern Analysis
- Classification and prediction
- Contrast Sets
- Clustering
- Outlier Detection
- Web Mining
- Other topics if time permits (spatial data, biomedical data, etc.)

Chapter 4 Objectives
- Learn basic techniques for data classification and prediction.
- Introduce techniques such as Neural Networks, Naïve Bayesian Classification, k-Nearest Neighbors, Decision Trees and Associative Classifiers.
- Realize the difference between supervised classification, prediction and unsupervised classification of data.

What is Classification?
The goal of data classification is to organize and categorize data in distinct classes.
- A model is first created based on the data distribution.
- The model is then used to classify new data.
- Given the model, a class can be predicted for new data.
With classification, I can predict in which bucket to put the ball, but I can't predict the weight of the ball.

Application
- Credit approval
- Target marketing
- Medical diagnosis
- Defective parts identification in manufacturing
- Crime zoning
- Treatment effectiveness analysis
- Etc.

Classification = Learning a Model
[Figure: a labeled training set is used to learn a classification model; new unlabeled data is then passed to the model for labeling (classification).]

Classification is a three-step process
1. Model construction (Learning):
- Each tuple is assumed to belong to a predefined class, as determined by one of the attributes, called the class label.
- The set of all tuples used for construction of the model is called the training set.
- The model is represented in one of the following forms: classification rules (IF-THEN statements), a decision tree, or mathematical formulae.
2. Model Evaluation (Accuracy): estimate the accuracy rate of the model based on a test set.
- The known label of each test sample is compared with the classified result from the model.
- The accuracy rate is the percentage of test set samples that are correctly classified by the model.
- The test set must be independent of the training set; otherwise the estimate is over-optimistic, because the model may have over-fitted the training data.

3. Model Use (Classification): the model is used to classify unseen objects.
- Give a class label to a new tuple
- Predict the value of an actual attribute

Classification with Holdout
[Figure: the data is split into training data, used to derive the classifier (model), and testing data, used to estimate accuracy.]
- Holdout (partition D in two; one part for test, one for training)
- Random sub-sampling (repeated holdouts with different partitioning)
- K-fold cross validation (K partitions of D; do k experiments, each with a different test partition. Unlike random sub-sampling, the test partitions do not overlap. Typically k=10)
- Bootstrapping (N samples drawn with replacement; each typically contains about 63% of the distinct tuples of D)
- Leave-one-out (|D| experiments, with |D|-1 tuples for training and one for test)

1. Classification Process (Learning)
Training data, fed to the classification algorithm:

Name     Income  Age       Credit rating
Bruce    Low     <30       bad
Dave     Medium  [30..40]  good
William  High    <30       good
Marie    Medium  >40       good
Anne     Low     [30..40]  good
Chris    Medium  <30       bad

Resulting classifier (model): IF Income = High OR Age > 30 THEN CreditRating = Good

2. Classification Process (Accuracy Evaluation)
Testing data, fed to the classifier (model):

Name  Income  Age       Credit rating
Tom   Medium  <30       bad
Jane  High    <30       bad
Wei   High    >40       good
Hua   Medium  [30..40]  good

How accurate is the model IF Income = High OR Age > 30 THEN CreditRating = Good?
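The accuracy question above can be answered by applying the rule to each test tuple and counting the matches. The following is a minimal sketch, not part of the original slides. It assumes that the age brackets "[30..40]" and ">40" both satisfy "Age > 30" and that any tuple not covered by the rule is labeled "bad"; the function names are hypothetical.

```python
# Test set from the credit-rating example: (name, income, age bracket, actual rating).
TEST_SET = [
    ("Tom", "Medium", "<30", "bad"),
    ("Jane", "High", "<30", "bad"),
    ("Wei", "High", ">40", "good"),
    ("Hua", "Medium", "[30..40]", "good"),
]


def classify(income, age_bracket):
    """IF Income = High OR Age > 30 THEN CreditRating = Good.

    Assumptions (not stated in the slides): the brackets "[30..40]" and ">40"
    count as Age > 30, and everything else defaults to "bad".
    """
    older_than_30 = age_bracket in ("[30..40]", ">40")
    return "good" if income == "High" or older_than_30 else "bad"


def accuracy(rows):
    """Fraction of test tuples whose predicted label equals the known label."""
    correct = sum(1 for _, income, age, actual in rows
                  if classify(income, age) == actual)
    return correct / len(rows)


print(accuracy(TEST_SET))  # 0.75: only Jane is misclassified
```

Under these assumptions the rule gets three of the four test tuples right, so the estimated accuracy rate is 75%.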

3. Classification Process (Classification)
New data, fed to the classifier (model):

Name  Income  Age       Credit rating
Paul  High    [30..40]  ?

Credit rating?

Improving Accuracy
[Figure: Classifier 1, Classifier 2, Classifier 3, ... are built from the data; for new data their votes are combined into a composite classifier.]

Evaluating Classification Methods
- Predictive accuracy: ability of the model to correctly predict the class label
- Speed and scalability: time to construct the model; time to use the model
- Robustness: handling noise and missing values
- Scalability: efficiency in large databases (not memory resident data)
- Interpretability: the level of understanding and insight provided by the model
- Form of rules: decision tree size; the compactness of classification rules

Framework (Supervised Learning)
[Figure: labeled data is split into training data, used to derive the classifier (model), and testing data, used to estimate accuracy; the model is then applied to unlabeled new data.]
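The "combine votes" step of the composite classifier can be sketched as a simple majority vote. The three member classifiers below are hypothetical stand-ins invented for illustration; the slides do not say how the individual classifiers are built.

```python
from collections import Counter


def combine_votes(classifiers, example):
    """Return the class label predicted by most of the member classifiers."""
    votes = [classify(example) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical member classifiers (assumptions, not from the slides).
def by_income(example):
    return "good" if example["income"] == "High" else "bad"


def by_age(example):
    return "good" if example["age"] in ("[30..40]", ">40") else "bad"


def pessimist(example):
    return "bad"


paul = {"income": "High", "age": "[30..40]"}
print(combine_votes([by_income, by_age, pessimist], paul))  # good (2 votes to 1)
```

With an odd number of classifiers and two classes there are no ties; other settings would need an explicit tie-breaking rule.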

Classification Methods
- Neural Networks
- Bayesian Classification
- K-Nearest Neighbour
- Decision Tree Induction
- Associative Classifiers
- Support Vector Machines
- Case-Based Reasoning
- Genetic Algorithms
- Rough Set Theory
- Fuzzy Sets
- Etc.

Lecture Outline
Part I: Artificial Neural Networks (ANN) (1 hour)
- Introduction to Neural Networks
  - Biological Neural System
  - What is an artificial neural network?
  - Neuron model and activation function
  - Construction of a neural network
- Learning: Backpropagation Algorithm
  - Forward propagation of signal
  - Backward propagation of error
  - Example
Part II: Bayesian Classifiers (Statistical-based) (1 hour)
- What is Bayesian Classification
- Bayes theorem
- Naïve Bayes Algorithm
  - Using Laplace Estimate
  - Handling Missing Values and Numerical Data
- Belief Networks

Human Nervous System
- We have only just begun to understand how our neural system operates.
- A huge number of neurons and interconnections between them:
  - 100 billion (i.e. 10^11) neurons in the brain
  - a full Olympic-sized swimming pool contains 10^10 raindrops; the number of stars in the Milky Way is of the same magnitude
  - 10^4 connections per neuron
- Biological neurons are slower than computers:
  - Neurons operate in 10^-3 seconds, computers in 10^-9 seconds.
  - The brain makes up for the slow rate of operation of a single neurone by the large number of neurons and connections (think about the speed of face recognition by a human, for example, and the time it takes fast computers to do the same task).

Biological Neurons
- The purpose of neurons: transmit information in the form of electrical signals.
  - A neuron accepts many inputs, which are all added up in some way.
  - If enough active inputs are received at once, the neuron will be activated and fire; if not, it remains in its inactive state.
- Structure of a neuron:
  - Cell body: contains the nucleus holding the chromosomes
  - Dendrites
  - Axon
  - Synapse: couples the axon with the dendrite of another cell; information is passed from one neuron to another through synapses; there is no direct linkage across the junction, it is a chemical one.

Operation of biological neurons
- Signals are transmitted between neurons by electrical pulses (action potentials, AP) traveling along the axon.
- When the potential at the synapse is raised sufficiently by the AP, it releases chemicals called neurotransmitters; it may take the arrival of more than one AP before the synapse is triggered.
- The neurotransmitters diffuse across the gap and chemically activate gates on the dendrites, which allows charged ions to flow.
- The flow of ions alters the potential of the dendrite and provides a voltage pulse on the dendrite (post-synaptic potential, PSP):
  - some synapses excite the dendrite they affect, while others inhibit it
  - the synapses also determine the strength of the new input signal
- Each PSP travels along its dendrite and spreads over the soma (cell body).
- The soma sums the effects of thousands of PSPs; if the resulting potential exceeds a threshold, the neuron fires and generates another AP.

What is an Artificial Neural Network (NN)?
- A neural network is a data structure that supposedly simulates the behaviour of neurons in a biological brain.
- A neural network is composed of layers of interconnected units.
- Messages are passed along the connections from one unit to the other.
- Messages can change based on the weight of the connection and the value in the node.
[Figure: a feedforward network; the input vector x_i enters the input nodes, passes through the hidden nodes and reaches the output nodes, which produce the output vector.]

What is an Artificial Neural Network (NN)?
- A network of many simple units (neurons, nodes).
- The units are connected by connections. Each connection has an associated numeric weight.
- Units receive inputs (from the environment or other units) via the connections. They produce output using their weights and the inputs (i.e. they operate locally).
- A NN can be represented as a directed graph.
- NNs learn from examples and exhibit some capability for generalization beyond the training data:
  - knowledge is acquired by the network from its environment via learning and is stored in the weights of the connections
  - the training (learning) rule is a procedure for modifying the weights of connections in order to perform a certain task
  - there are also some sophisticated techniques that allow learning by adding and pruning connections (between nodes)

A Neuron
[Figure: the input vector x = (x_0, x_1, ..., x_n) is combined with the weight vector w = (w_0, w_1, ..., w_n) in a weighted sum; the bias θ is added and the activation function f produces the output y.]
The n-dimensional input vector x is mapped to the variable y by means of the scalar product and a nonlinear function mapping.

Neuron Model
- Each connection from unit i to unit j has a numeric weight $w_{ij}$ associated with it, which determines the strength and the sign of the connection.
- Each neuron first computes the weighted sum of its inputs, $wp$, and then applies an activation function f to derive the output (activation) a.
- A neuron may have a special weight called the bias weight b.
- NNs represent a function of their weights (parameters). By adjusting the weights, we change this function. This is done by using a learning rule.
- Example: if there are 2 inputs $p_1 = 2$ and $p_2 = 3$, and if $w_1 = 3$, $w_2 = 1$, $b = -1.5$, then $a = f(p_1 w_1 + p_2 w_2 + b) = f(2 \cdot 3 + 3 \cdot 1 - 1.5) = f(7.5)$. What is f?

Activation function
- Also called processing element, squashing function or firing rule.
- Is applied by each neuron to its input values and weights (as well as the bias): $S = \theta + \sum_{j=1}^{n} x_j W_j$
- Can be unipolar [0,1] or bipolar [-1,1].
- The function can be:
  - Linear: $f(S) = cS$
  - Threshold or step: $f(S) = 1$ if $S > T$; 0 otherwise
  - Sigmoid: $f(S) = \frac{1}{1 + e^{-cS}}$
  - Gaussian: $f(S) = e^{-S^2 / v}$
  - etc.

Correspondence Between Artificial and Biological Neurons
How does this artificial neuron relate to the biological one?
- input p (or input vector p): input signal (or signals) at the dendrite
- weight w (or weight vector w): strength of the synapse (or synapses)
- summer and transfer function: cell body
- neuron output a: signal at the axon

Constructing the Network
- The number of input nodes: generally corresponds to the dimensionality of the input tuples. Input is converted into binary and concatenated to form a bitstream.
  - E.g. age 20-80, 6 intervals, one bit per interval: [20, 30) → 000001, [30, 40) → 000010, ..., [70, 80) → 100000
  - or as a binary code: [20, 30) → 001, [30, 40) → 010, ..., [70, 80) → 110
- Number of hidden nodes: determined by an expert or, in some cases, adjusted during training.
- Number of output nodes: generally the number of classes.
  - E.g. 10 classes C1, C2, ..., C10: 0001 → C1, 0010 → C2, ..., 1010 → C10
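The neuron model above can be sketched in a few lines. The example values (inputs 2 and 3, weights 3 and 1, bias -1.5) are read from the worked example as an assumption, since digits were lost in the transcription; the default parameters c, T and v are likewise illustrative choices.

```python
import math


def linear(s, c=1.0):
    return c * s


def threshold(s, t=0.0):
    return 1.0 if s > t else 0.0


def sigmoid(s, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * s))


def gaussian(s, v=1.0):
    return math.exp(-(s * s) / v)


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    s = sum(p * w for p, w in zip(inputs, weights)) + bias
    return activation(s)


# Assumed example values: p = (2, 3), w = (3, 1), b = -1.5, so S = 7.5.
print(neuron_output([2, 3], [3, 1], -1.5, linear))     # 7.5
print(neuron_output([2, 3], [3, 1], -1.5, threshold))  # 1.0
print(neuron_output([2, 3], [3, 1], -1.5, sigmoid))    # close to 1
```

Swapping the activation function changes only the last step; the weighted sum is the same for all of them.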

Neural Networks: Pros and Cons
Advantages
- Prediction accuracy is generally high.
- Robust: works when training examples contain errors.
- Output may be discrete, real-valued, or a vector of several discrete or real-valued attributes.
- Fast evaluation of the learned target function.
Criticism
- Long training time.
- Difficult to understand the learned function (weights).
- Typically for numerical data.
- Not easy to incorporate domain knowledge.
- Design can be tedious and error prone (too small: slow learning; too big: instability or poor performance).

Learning Paradigms
(1) Classification: adjust weights using Error = Desired - Actual.
(2) Reinforcement: adjust weights using reinforcement.
[Figure: training data inputs are fed to the network; the actual output is compared with the actual class (label).]

Learning Algorithms
- Back propagation for classification
- Kohonen feature maps for clustering
- Recurrent back propagation for classification
- Radial basis function for classification
- Adaptive resonance theory
- Probabilistic neural networks

Major Steps for Back Propagation Network
- Constructing a network:
  - input data representation
  - selection of number of layers, number of nodes in each layer
- Training the network using training data
- Pruning the network
- Interpreting the results

Network Training
The ultimate objective of training: obtain a set of weights that makes almost all the tuples in the training data classified correctly.
Steps:
- Initial weights are set randomly.
- Input tuples are fed into the network one by one.
- Activation values for the hidden nodes are computed.
- The output vector can be computed after the activation values of all hidden nodes are available.
- Weights are adjusted using the error (desired output - actual output), which is propagated backwards.

Network Pruning
- A fully connected network will be hard to articulate.
- n input nodes, h hidden nodes and m output nodes lead to h(m+n) links (weights).
- Pruning: remove some of the links without affecting the classification accuracy of the network.

Backpropagation Network: Architecture
1) A network with 1 or more hidden layers
  - output neurons: 1 output neuron for each class
  - hidden neurons (1 hidden layer)
  - inputs: 1 input neuron for each attribute
2) Feedforward network: each neuron receives input only from the neurons in the previous layer.
3) Typically fully connected: all neurons in a layer are connected with all neurons in the next layer.
4) Weights initialization: small random values, e.g. in [-1,1].
5) Neuron model: weighted sum of input signals plus a differentiable transfer function, $a = f(wp + b)$. Any differentiable transfer function f can be used; most frequently the sigmoid and tan-sigmoid (hyperbolic tangent sigmoid) functions are used, with $n = wp + b$:
  - sigmoid: $a = \frac{1}{1 + e^{-n}}$
  - tan-sigmoid: $a = \frac{e^{n} - e^{-n}}{e^{n} + e^{-n}}$

Weather data used in the examples:

Outlook   Temperature  Humidity  Windy  Play
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rainy     mild         high      false  Yes
rainy     cool         normal    false  Yes
rainy     cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rainy     mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rainy     mild         high      true   No

Architecture: Number of Input Units
- Numerical data: typically 1 input unit for each attribute.
- Categorical data: 1 input unit for each attribute value.
- How many input units for the weather data? One unit per attribute value gives 3 (outlook: sunny, overcast, rainy) + 3 (temperature: hot, mild, cool) + 2 (humidity: high, normal) + 2 (windy: false, true) = 10 input units.
- Encoding of the input examples: typically binary, depending on the value of the attribute (on and off). For example, "sunny, hot, high, false" is encoded as 1 0 0 1 0 0 1 0 1 0.
- Other possibilities are also acceptable. For example, windy could be coded with only one unit: true or false (1 or 0).

Number of Output Units
- Typically 1 neuron for each class.
- Encoding of the targets (classes): typically binary, e.g. class 1 (no): 1 0, class 2 (yes): 0 1. For the example "sunny, hot, high, false, No" the target is 1 0.
- Another possibility is to code the target class with only one unit: Yes or No (1 or 0).

Number of Hidden Layers and Units in Them
- An art! Typically set by trial and error.
- The task constrains the number of input and output units, but not the number of hidden layers and the neurons in them.
  - Too many hidden layers and units (i.e. too many weights): overfitting.
  - Too few: underfitting, i.e. the NN is not able to learn the input-output mapping.
- A heuristic to start with: 1 hidden layer with n hidden neurons, n = (inputs + output_neurons)/2.

Learning in Backpropagation NNs
Idea of backpropagation learning:
- For each training example p:
  - Propagate p through the network and calculate the output a. Compare the desired output d with the actual output a and calculate the error.
  - Update the weights of the network to reduce the error.
- Repeat until the error over all examples < threshold.
Why "backpropagation"? It adjusts the weights backwards (from the output to the input units) by propagating the weight change:
$w_{pq}^{new} = w_{pq}^{old} + \Delta w_{pq}$
How do we calculate the weight change?
[Figure: labeled data such as "sunny, hot, high, false, No" is propagated forward through the network; the output a is compared with the desired output d to calculate the error; the error adjustments are propagated backward.]
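The binary input and target encoding described above for the weather data can be sketched as follows. The attribute order and the helper names are assumptions made for illustration; they follow the order of the input units in the slide figure.

```python
# One input unit per attribute value, in the order used in the slide figure.
ATTRIBUTE_VALUES = {
    "outlook": ["sunny", "overcast", "rainy"],
    "temperature": ["hot", "mild", "cool"],
    "humidity": ["high", "normal"],
    "windy": ["false", "true"],
}


def encode_example(example):
    """Return the on/off (1/0) input vector for one weather example."""
    bits = []
    for attribute, values in ATTRIBUTE_VALUES.items():
        bits.extend(1 if example[attribute] == value else 0 for value in values)
    return bits


def encode_target(play):
    """Class 1 (no) -> [1, 0], class 2 (yes) -> [0, 1]."""
    return [1, 0] if play == "no" else [0, 1]


day = {"outlook": "sunny", "temperature": "hot", "humidity": "high", "windy": "false"}
print(encode_example(day))  # [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]
print(encode_target("no"))  # [1, 0]
```

The resulting vector has 10 entries, one per input unit, with exactly one unit switched on per attribute.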

Backpropagation Learning: Sum of Squared Errors
- The sum of squared errors is a classical measure of error.
- E for a single training example over all output neurons, where $d_i$ is the desired and $a_i$ the actual network output for output neuron i:
  $E = \frac{1}{2} \sum_i e_i^2 = \frac{1}{2} \sum_i (d_i - a_i)^2$
- Thus, backpropagation learning can be viewed as an optimization search in the weight space.
  - Goal state: the set of weights for which the performance index (error) is minimum.
  - Search method: hill climbing [reduce the error for each training example].

Steepest Gradient Descent
- The gradient $\partial E / \partial w$ gives the slope of the error function for one weight.
- A function decreases most rapidly when the direction of movement is the direction of the negative of the gradient.
- Hence, we want to adjust the weights so that the change moves the system down the error surface in the direction of the locally steepest descent, given by the negative of the gradient: $\Delta w = -\eta \, \frac{\partial E}{\partial w}$
- η is the learning rate; it defines the step and is typically in the range (0,1).
- We want to find the weight where the slope (gradient) is 0.

Backpropagation Algorithm: Idea
- The backpropagation algorithm adjusts weights by working backward from the output layer to the input layer.
- Calculate the error and propagate this error from layer to layer.
- 2 approaches:
  - Incremental: the weights are adjusted after each training example is applied. Also called approximate steepest descent. Preferred as it requires less space.
  - Batch: the weights are adjusted once, after all training examples are applied and a total error has been calculated.

Backpropagation Rule (Delta Rule)
- $w_{pq}(t)$: weight from node p to node q at time t
- $w_{pq}(t+1) = w_{pq}(t) + \Delta w_{pq}$
- $\Delta w_{pq} = \eta \, \delta_q \, o_p$: weight change
- The weight change is proportional to the output activation of neuron p (i.e. $o_p$) and the error δ of neuron q (i.e. $\delta_q$).
- δ is calculated in 2 different ways:
  - q is an output neuron: $\delta_q = (d_q - o_q) \, f'(net_q)$
  - q is a hidden neuron: $\delta_q = f'(net_q) \sum_i w_{qi} \, \delta_i$ (i is over the nodes in the layer above q)
- $f'(net_q)$ is the derivative of the activation function at neuron q with respect to the input of q ($net_q$).
[Figure: solid lines show forward propagation of signals, dashed lines show backward propagation of error.]
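To see steepest descent at work on a single weight, consider a minimal sketch with one linear neuron a = w·p and error E = ½(d - a)², so that ∂E/∂w = -(d - a)·p. The values p = 2, d = 4, the starting weight 0 and the learning rate 0.1 are arbitrary assumptions chosen for illustration.

```python
def gradient_descent(p, d, w=0.0, eta=0.1, steps=50):
    """Repeatedly apply w <- w - eta * dE/dw for E = 0.5 * (d - w * p) ** 2."""
    errors = []
    for _ in range(steps):
        a = w * p                      # actual output of the linear neuron
        errors.append(0.5 * (d - a) ** 2)
        gradient = -(d - a) * p        # slope of E with respect to w
        w = w - eta * gradient         # move against the gradient
    return w, errors


w_final, errors = gradient_descent(p=2.0, d=4.0)
print(round(w_final, 6))  # 2.0: the weight where the slope is 0, since 2.0 * 2 = 4
```

Each step shrinks the error, and the weight settles where the gradient vanishes. A learning rate that is too large would overshoot instead.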

Derivative of Sigmoid Activation Function
- From the formulas for δ, we must be able to calculate the derivative of f. For a sigmoid transfer function:
  $f(net_m) = o_m = \frac{1}{1 + e^{-net_m}}$
  $f'(net_m) = \frac{e^{-net_m}}{(1 + e^{-net_m})^2} = o_m (1 - o_m)$
- Thus, the backpropagation errors for a network with sigmoid transfer function are:
  - q is an output neuron: $\delta_q = (d_q - o_q) \, o_q (1 - o_q)$
  - q is a hidden neuron: $\delta_q = o_q (1 - o_q) \sum_i w_{qi} \, \delta_i$

Backpropagation Algorithm: Summary
1. Determine the architecture of the network:
  - how many input and output neurons; what output encoding
  - hidden neurons and layers
2. Initialize all weights (biases incl.) to small random values, typically in [-1,1].
3. Repeat until the termination criterion is satisfied:
  - (forward pass) Present a training example and propagate it through the network to calculate the actual output.
  - (backward pass) Compute the error (the δ values for the output neurons). Starting with the output layer, repeat for each layer in the network:
    - propagate the δ values back to the previous layer
    - update the weights between the two layers
An epoch is 1 pass through the training set. The stopping criteria are checked at the end of each epoch:
- The error (mean absolute or mean square) is below a threshold.
  - All training examples are propagated and the total error is calculated.
  - The threshold is determined heuristically, e.g. 0.3.
- The maximum number of epochs is reached.
- Early stopping using a validation set.
It typically takes hundreds or thousands of epochs for a NN to converge.

Some Interesting NN Applications
- There are many examples of applications using NNs.
- Network design is typically the result of several months of trial and error experimentation.
- Moral: NNs are widely applicable, but they cannot magically solve problems; wrong choices lead to poor performance.
- "NNs are the second best way of doing just about anything" (John Denker).
- NNs provide passable performance on many tasks that would be difficult to solve explicitly with other techniques.
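One forward and backward pass of the algorithm summarized above can be sketched for a tiny network with 2 inputs, 2 hidden neurons and 1 output neuron, all using the sigmoid. The initial weights, the training example and the learning rate are arbitrary assumptions; a real run would initialize the weights randomly and loop over many epochs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(p, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: return the hidden activations and the network output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, p)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, out


def backprop_step(p, d, w_hidden, b_hidden, w_out, b_out, eta=0.5):
    """One incremental update using the sigmoid delta formulas."""
    hidden, out = forward(p, w_hidden, b_hidden, w_out, b_out)
    delta_out = (d - out) * out * (1.0 - out)              # output neuron
    delta_hidden = [h * (1.0 - h) * w * delta_out          # hidden neurons
                    for h, w in zip(hidden, w_out)]
    new_w_out = [w + eta * delta_out * h for w, h in zip(w_out, hidden)]
    new_b_out = b_out + eta * delta_out
    new_w_hidden = [[w + eta * dh * x for w, x in zip(ws, p)]
                    for ws, dh in zip(w_hidden, delta_hidden)]
    new_b_hidden = [b + eta * dh for b, dh in zip(b_hidden, delta_hidden)]
    return new_w_hidden, new_b_hidden, new_w_out, new_b_out


def squared_error(d, out):
    return 0.5 * (d - out) ** 2


# Assumed starting point: fixed small weights, one training example.
params = ([[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0], [0.5, -0.5], 0.0)
example, desired = [1.0, 0.0], 1.0

error_before = squared_error(desired, forward(example, *params)[1])
params_after = backprop_step(example, desired, *params)
error_after = squared_error(desired, forward(example, *params_after)[1])
print(error_before > error_after)  # True: the update reduced the error
```

The hidden deltas are computed from the old output weights, before those weights are updated, which matches the order "propagate the δ values back, then update the weights".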

What is Bayesian Learning (Classification)?
- Bayesian classifiers are statistical classifiers.
- They can predict the class membership probability, i.e. the probability that a given example belongs to a particular class.
- They are based on the Bayes Theorem, presented in the "Essay Towards Solving a Problem in the Doctrine of Chances", published posthumously by his friend Richard Price in the Philosophical Transactions of the Royal Society of London in 1763.
Thomas Bayes [1702-1761]

More on Bayesian Classifiers
- It uses probabilistic learning by calculating explicit probabilities for hypotheses.
- A naïve Bayesian classifier, which assumes total independence between attributes, is commonly used for data classification and learning problems. It performs well with large data sets and exhibits high accuracy.
- The model is incremental in the sense that each training example can incrementally increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.

Bayes Theorem
- Given a data sample E (also called evidence) with an unknown class label, H is the hypothesis that E belongs to a specific class C.
- The probability of the hypothesis H conditioned on E, P(H|E), also called the posterior probability, follows the Bayes theorem:
  $P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
- Example: instances of fruits, described by their colour and shape. Let E be "red and round" and H the hypothesis that E is an apple.

Bayes Theorem: The Fruit Example
$P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
- P(H|E) reflects our confidence that E is an apple given that we have seen that E is red and round.
  - Called the posterior, or posterior probability, of H conditioned on E.
- P(H) is the probability that any given example is an apple, regardless of how it looks.
  - Called the prior, or a priori probability, of H.
- The posterior probability is based on more information than the a priori probability, which is independent of E.
- What is P(E|H)? The probability of E conditioned on H (the likelihood): the probability that E is red and round given that we know that E is an apple.
- What is P(E)? The prior probability of E: the probability that an example from the fruit data set is red and round.
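A small numeric illustration of the Bayes theorem for the fruit example follows. All three input probabilities are hypothetical values invented for this sketch; the slides give no numbers for the fruit data.

```python
def posterior(likelihood, prior, evidence):
    """Bayes theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence


# Hypothetical values (assumptions, not from the slides):
p_apple = 0.2            # P(H): a fruit is an apple
p_red_round_apple = 0.7  # P(E|H): an apple is red and round
p_red_round = 0.25       # P(E): any fruit is red and round

print(posterior(p_red_round_apple, p_apple, p_red_round))  # about 0.56
```

Seeing that the fruit is red and round raises the confidence that it is an apple from the assumed prior of 0.2 to a posterior of about 0.56.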

Bayes Theorem: How to use it for classification?
$P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
- In classification tasks we would like to predict the class of a new example E. We can do this by:
  - Calculating P(H|E) for each H (class): the probability that the hypothesis H is true given the example E.
  - Comparing these probabilities and assigning E to the class with the highest probability.
- How to estimate P(E), P(H) and P(E|H)? From the given data (this is the training phase of the classifier).

Naïve Bayes Classifier
- Suppose we have n classes $C_1, C_2, \ldots, C_n$. Given an unknown sample X, the classifier will predict that $X = (x_1, x_2, \ldots, x_m)$ belongs to the class with the highest posterior probability:
  $X \in C_i$ if $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le n$, $j \ne i$
- Maximize $\frac{P(X \mid C_i) \, P(C_i)}{P(X)}$, i.e. maximize $P(X \mid C_i) \, P(C_i)$, since P(X) is the same for all classes.
  - $P(C_i) = s_i / s$
  - $P(X \mid C_i) = \prod_{k=1}^{m} P(x_k \mid C_i)$, where $P(x_k \mid C_i) = s_{ik} / s_i$
- Greatly reduces the computation cost: only count the class distribution.
- Naïve: class conditional independence.

Naïve Bayes Algorithm: Basic Assumption
- Naïve Bayes uses all attributes to make a decision and allows them to make contributions to the decision that are equally important and independent of one another.
  - Independence assumption: attributes are conditionally independent of each other given the class.
  - Equal importance assumption: attributes are equally important.
- Unrealistic assumptions! This is why it is called Naïve Bayes.
  - Attributes are often dependent on one another.
  - Attributes are not equally important.
- But these assumptions lead to a simple method which works surprisingly well in practice!

Naïve Bayes (NB) for the Tennis Example
- Consider the tennis data:

Outlook   Temperature  Humidity  Windy  Play
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rainy     mild         high      false  Yes
rainy     cool         normal    false  Yes
rainy     cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rainy     mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rainy     mild         high      true   No

- Suppose we encounter a new example which has to be classified:

Outlook  Temperature  Humidity  Windy  Play
sunny    cool         high      true   ?

- Recall the Bayes theorem: $P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
- What are H and E for our example?
  - The hypothesis H is that the class is Play=Yes (and there is another hypothesis: that the class is Play=No).
  - The evidence E is the new example (i.e. a particular combination of observed attribute values for the new day).

Naïve Bayes for the Tennis Example - 2
$P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
- We need to calculate P(yes|E) and P(no|E) and compare them, where E is:

Outlook  Temperature  Humidity  Windy  Play
sunny    cool         high      true   ?

- We denote the 4 pieces of evidence as follows:
  - outlook=sunny with $E_1$
  - temperature=cool with $E_2$
  - humidity=high with $E_3$
  - windy=true with $E_4$
- If we assume that they are independent given the class, then their combined probability is obtained by multiplication:
  $P(E \mid yes) = P(E_1 \mid yes) \, P(E_2 \mid yes) \, P(E_3 \mid yes) \, P(E_4 \mid yes)$

Naïve Bayes for the Tennis Example - 3
- Hence:
  $P(yes \mid E) = \frac{P(E_1 \mid yes) \, P(E_2 \mid yes) \, P(E_3 \mid yes) \, P(E_4 \mid yes) \, P(yes)}{P(E)}$
  $P(no \mid E) = \frac{P(E_1 \mid no) \, P(E_2 \mid no) \, P(E_3 \mid no) \, P(E_4 \mid no) \, P(no)}{P(E)}$
- The probabilities in the numerator will be estimated from the data.
- There is no need to estimate P(E), as it appears in the denominator of every hypothesis, i.e. it disappears when we compare them.

Naïve Bayes for the Tennis Example cont.
Tennis data, counts and probabilities:

Attribute    Value     Count yes  Count no  P(value|yes)  P(value|no)
outlook      sunny     2          3         2/9           3/5
outlook      overcast  4          0         4/9           0/5
outlook      rainy     3          2         3/9           2/5
temperature  hot       2          2         2/9           2/5
temperature  mild      4          2         4/9           2/5
temperature  cool      3          1         3/9           1/5
humidity     high      3          4         3/9           4/5
humidity     normal    6          1         6/9           1/5
windy        false     6          2         6/9           2/5
windy        true      3          3         3/9           3/5
play                   9          5         9/14          5/14

- 9/14 is the proportion of days when play is yes.
- 6/9 is the proportion of days when humidity is normal among the days when play is yes, i.e. the probability of humidity being normal given that play is yes.

Naïve Bayes for the Tennis Example cont.
$P(yes \mid E) = \frac{P(E_1 \mid yes) \, P(E_2 \mid yes) \, P(E_3 \mid yes) \, P(E_4 \mid yes) \, P(yes)}{P(E)} = ?$
- $P(E_1 \mid yes)$ = P(outlook=sunny|yes) = 2/9
- $P(E_2 \mid yes)$ = P(temperature=cool|yes) = 3/9
- $P(E_3 \mid yes)$ = P(humidity=high|yes) = 3/9
- $P(E_4 \mid yes)$ = P(windy=true|yes) = 3/9
- P(yes) = ? The probability of Play=yes without knowing any E, i.e. anything about the particular day; the prior probability of yes: P(Play=yes) = 9/14
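The counting and multiplication steps above can be reproduced directly from the 14 tennis records. This is a minimal sketch without Laplace correction; the function names are hypothetical, and the scores are the numerators only, since P(E) cancels in the comparison.

```python
from collections import Counter, defaultdict

# The tennis (weather) data: outlook, temperature, humidity, windy, play.
DATA = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def train(rows):
    """Count class frequencies and attribute-value frequencies per class."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row in rows:
        for index, value in enumerate(row[:-1]):
            value_counts[(index, row[-1])][value] += 1
    return class_counts, value_counts


def score(example, label, class_counts, value_counts):
    """P(E1|label) * ... * P(E4|label) * P(label), i.e. the numerator only."""
    result = class_counts[label] / sum(class_counts.values())
    for index, value in enumerate(example):
        result *= value_counts[(index, label)][value] / class_counts[label]
    return result


class_counts, value_counts = train(DATA)
new_day = ("sunny", "cool", "high", "true")
score_yes = score(new_day, "yes", class_counts, value_counts)
score_no = score(new_day, "no", class_counts, value_counts)
print(round(score_yes, 4), round(score_no, 4))  # 0.0053 0.0206
```

Since the score for "no" is about 4 times the score for "yes", the new day is classified as play=no.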

Naïve Bayes for the Tennis Example cont.3
- By substituting the respective evidence probabilities:
  $P(yes \mid E) = \frac{\frac{2}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{9}{14}}{P(E)} = \frac{0.0053}{P(E)}$
- Similarly calculating P(no|E):
  $P(no \mid E) = \frac{\frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14}}{P(E)} = \frac{0.0206}{P(E)}$
- P(no|E) > P(yes|E), so for the new day play=no is more likely than play=yes (about 4 times more likely).

A Problem with Naïve Bayes
- Suppose that the training data for the tennis example was different: outlook=sunny had always been associated with play=no (i.e. outlook=sunny had never occurred together with play=yes).
- Then P(yes|outlook=sunny) = 0 and P(no|outlook=sunny) = 1, and P(outlook=sunny|yes) = 0, so
  $P(yes \mid E) = \frac{P(E_1 \mid yes) \, P(E_2 \mid yes) \, P(E_3 \mid yes) \, P(E_4 \mid yes) \, P(yes)}{P(E)} = 0$
- The final probability P(yes|E) = 0 no matter what the other probabilities are, i.e. zero probabilities hold a veto over the other probabilities.
- This is a problem! If it happens in the training set, it leads to poor prediction on new data.
- Solution: use the Laplace estimator (correction) to calculate probabilities. It adds 1 to the numerator and k to the denominator, where k is the number of attribute values for a given attribute.

Laplace Correction: Modified Tennis Example

outlook   Count yes  Count no  P(value|yes)  P(value|no)
sunny     0          5         0/7           5/7
overcast  4          0         4/7           0/7
rainy     3          2         3/7           2/7

- Without correction: P(sunny|yes) = 0/7, P(overcast|yes) = 4/7, P(rainy|yes) = 3/7.
- The Laplace correction adds 1 to the numerator and 3 to the denominator:
  P(sunny|yes) = (0+1)/(7+3) = 1/10, P(overcast|yes) = (4+1)/(7+3) = 5/10, P(rainy|yes) = (3+1)/(7+3) = 4/10
- This ensures that an attribute value which occurs 0 times will receive a nonzero (although small) probability.

Laplace Correction: Original Tennis Example

outlook   Count yes  Count no  P(value|yes)  P(value|no)
sunny     2          3         2/9           3/5
overcast  4          0         4/9           0/5
rainy     3          2         3/9           2/5

- Without correction: P(sunny|yes) = 2/9, P(overcast|yes) = 4/9, P(rainy|yes) = 3/9.
- With correction: P(sunny|yes) = (2+1)/(9+3) = 3/12, P(overcast|yes) = (4+1)/(9+3) = 5/12, P(rainy|yes) = (3+1)/(9+3) = 4/12
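The Laplace correction above is a one-line change to the probability estimate. The sketch below uses exact fractions so the results can be compared with the slide values; the function name is hypothetical.

```python
from fractions import Fraction


def laplace_estimate(count, class_total, num_values):
    """(count + 1) / (class_total + k), where k is the number of attribute values."""
    return Fraction(count + 1, class_total + num_values)


# Modified tennis example: outlook counts for play=yes are sunny 0, overcast 4, rainy 3.
modified = [laplace_estimate(c, 7, 3) for c in (0, 4, 3)]
print(modified)  # [Fraction(1, 10), Fraction(1, 2), Fraction(2, 5)]

# Original tennis example: outlook counts for play=yes are sunny 2, overcast 4, rainy 3.
original = [laplace_estimate(c, 9, 3) for c in (2, 4, 3)]
print(original)  # [Fraction(1, 4), Fraction(5, 12), Fraction(1, 3)]
```

Fraction reduces the results, so 5/10 prints as 1/2 and 4/12 as 1/3. In both cases the corrected probabilities still sum to 1.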

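Handling Missing Values
- Easy: missing value in the evidence E (the new example): omit this attribute.
  - e.g. E: outlook=?, temperature=cool, humidity=high, windy=true
  - then $P(yes \mid E) = \frac{\frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{9}{14}}{P(E)} = \frac{0.0238}{P(E)}$ and $P(no \mid E) = \frac{\frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14}}{P(E)} = \frac{0.0343}{P(E)}$
  - Compare these results with the previous ones! As one of the fractions is missing, the probabilities are higher than before, but this is not a problem as there is a missing fraction in both cases.
- Missing value in a training example: do not include it in the frequency counts, and calculate the probabilities based on the number of values that actually occur and not on the total number of training examples.

Handling Numeric Attributes
- We would like to classify the following new example: outlook=sunny, temperature=66, humidity=90, windy=true
- Q. How to calculate P(temperature=66|yes), P(humidity=90|yes), P(temperature=66|no), P(humidity=90|no)?

Using Probability Density Function
- By assuming that numerical values have a normal (Gaussian) probability distribution and using the probability density function.
- For a normal distribution with mean µ and standard deviation σ, the probability density function is:
  $f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
- What is the meaning of the probability density function of a continuous random variable?
  - It is closely related to probability but is not exactly the probability (e.g. the probability that x is exactly 66 is 0).
  - The probability that x takes a value in a small region (between x - ε/2 and x + ε/2) is approximately ε·f(x) (e.g. the probability that x is between 64 and 68 is approximately 4·f(66)).

Calculating Probabilities Using Probability Density Function
- With mean 73 and standard deviation 6.2 for temperature on the play=yes days:
  $f(temperature = 66 \mid yes) = \frac{1}{6.2 \sqrt{2\pi}} \, e^{-\frac{(66 - 73)^2}{2 \cdot 6.2^2}} \approx 0.034$
- f(humidity=90|yes), f(temperature=66|no) and f(humidity=90|no) are calculated in the same way from the corresponding means and standard deviations.
- The densities then replace the categorical probabilities in the product:
  $P(yes \mid E) = \frac{P(sunny \mid yes) \, f(temperature = 66 \mid yes) \, f(humidity = 90 \mid yes) \, P(true \mid yes) \, P(yes)}{P(E)}$, and similarly for P(no|E).
- Compare with the categorical tennis data! P(no|E) > P(yes|E), so the prediction is no play.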
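The density value for temperature=66 on the play=yes days can be checked with a short function. The mean 73 and standard deviation 6.2 are the values that survive in the slide formula; the function name is hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    exponent = -((x - mean) ** 2) / (2 * std ** 2)
    return math.exp(exponent) / (std * math.sqrt(2 * math.pi))


density = normal_pdf(66, mean=73, std=6.2)
print(round(density, 4))      # 0.034
print(round(4 * density, 3))  # 0.136, rough probability of 64 <= temperature <= 68
```

The density is used in the Naïve Bayes product exactly where a categorical probability would otherwise appear.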

Naïve Bayes: Advantages and Disadvantages
Advantages:
- Simple approach.
- Clear semantics for representing, using and learning probabilistic knowledge.
- Requires 1 scan of the training data.
- In many cases outperforms more sophisticated learning methods: always try the simple method first!
Disadvantages:
- While there is only 1 scan, it is still computationally expensive.
- Since attributes are treated as though they were completely independent, the existence of dependencies between attributes skews the learning process!
- Normal distribution assumption when dealing with numeric attributes: a (minor) restriction, since one can discretize the data or follow other distributions.

Belief Network
- Allows class conditional dependencies to be expressed.
- It has a directed acyclic graph (DAG) and a set of conditional probability tables (CPT).
- Nodes in the graph represent variables and arcs represent probabilistic dependencies (child dependent on parent).
- There is one table for each variable X. The table contains the conditional distribution P(X|Parents(X)).

Bayesian Belief Networks: Example
[Figure: a belief network over the variables FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay and Dyspnea, with the conditional probability table for the variable LungCancer; its rows are LC and ~LC and its columns are (FH, S), (FH, ~S), (~FH, S), (~FH, ~S).]

Bayesian Belief Networks
Several cases of learning Bayesian belief networks:
- When both the network structure and all the variables are given, the learning is simply computing the CPT.
- When the network structure is given but some variables are not known or observable, iterative learning is necessary (compute the gradient of ln P(S|H), take steps toward the gradient and normalize).
- Many algorithms for learning the network structure exist.

Lecture Outline
Part III: k-Nearest Neighbour Lazy Learning (30 minutes)
- Nearest Neighbour
- K-Nearest Neighbours
- Agglomerative Nearest Neighbours
Part IV: Decision Trees (1 hour)
- What is a Decision Tree?
- Building a tree
- Pruning a tree
Part V: Associative Classifiers (1 hour)
- Rule Generation
- Rule Pruning
- Rule Selection
- Rule Combination

k-Nearest Neighbours (k-NN) Classification
- In k-nearest-neighbour classification, the training dataset is used to classify each member of a "target" dataset.
- There is no model created during a learning phase other than the training set itself.
- It is called a lazy-learning method.
- Rather than building a model and referring to it during the classification, k-NN directly refers to the training set for classification.

The Simple Nearest Neighbour Approach
- Nearest Neighbour is very simple.
- The training is nothing more than sorting the training data and storing it in a list.
- To classify a new entry, this entry is compared to the list to find the closest record, with values as similar as possible to the entry to classify (i.e. the nearest neighbour). The class of this record is simply assigned to the new entry.
- Different measures of similarity or distance can be used.

20 The Nearest Neighbour
[Figure: a new entry is compared with the sorted training data through a distance function; the record with the closest values is found and its class label becomes the class label of the new entry.]

The k-Nearest Neighbour Approach
The k-Nearest Neighbour is a variation of Nearest Neighbour. Instead of looking for only the closest record to the entry to classify, we look for the k records closest to it. To assign a class label to the new entry, from all the labels of the k nearest records we take the majority class label. Nearest Neighbour is a case of k-Nearest Neighbours with k = 1.

k Nearest Neighbours
[Figure: a new entry is compared with the sorted training data through a distance function; the k records with the closest values are found and vote on the class label of the new entry.]

Agglomerative Nearest Neighbours
Training records are put together in groups as the learning process goes on. The approach is named agglomerative because groups or clusters are merged during the learning. The training is relatively simple:
Each cluster has a center c, a radius r and the class label of its records.
Initially each record in the training set forms a cluster on its own.
Two clusters that are close together (within some epsilon distance of each other) and classify in the same category are combined to build a new aggregate cluster: a hypersphere of a larger radius and a new center.
If a cluster is not close to any other cluster given epsilon, it remains separate.
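
The majority vote among the k closest records can be sketched in a few lines. This is a minimal illustration, assuming numeric feature vectors and Euclidean distance; the names `knn_classify` and `training` and the sample points are made up for the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training, entry, k=1, distance=euclidean):
    """Label `entry` with the majority class among its k nearest training records.

    `training` is a list of (feature_vector, class_label) pairs; no model is
    built beforehand, the training set itself is consulted (lazy learning).
    """
    nearest = sorted(training, key=lambda record: distance(record[0], entry))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training set.
training = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
    ((5.0, 5.0), "B"), ((5.5, 4.5), "B"), ((2.4, 2.4), "B"),
]
print(knn_classify(training, (2.0, 2.0), k=1))
print(knn_classify(training, (2.0, 2.0), k=3))
```

With k = 1 the single closest record decides alone; with k = 3 the two "A" records outvote it, which shows how the choice of k changes the outcome.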

21 Agglomerative NN Classification
The classification of a new entry consists of finding the closest cluster to it and assigning it the label attached to that cluster. Agglomerative NN has a slower training than NN or k-NN but has the advantage of using less memory. There is no need to store the whole training set, only the center and radius of each cluster. In the two extremes: if all records are far from each other, they remain in separate clusters, which is Nearest Neighbour; if all points are close to each other, we end up with as many clusters as we have classes.

Cluster Overlap Problem
Since clusters are hyperspheres, overlaps of clusters with different labels are bound to happen. Clusters that grow and overlap with nearby clusters that classify differently can reduce accuracy. A new entry that falls in an overlap area between two clusters can easily be misclassified. One solution is to inhibit clusters from growing if enlarging a hypersphere would generate overlap with a cluster of a different class label, and simply create a new small cluster in between instead.

Agglomerative Nearest Neighbours
[Figure: a new entry is compared with the grouped training data through a distance function; the closest cluster is found and its label becomes the class label of the new entry.]

Distance Measures
The most used distance function is the Euclidean distance:
$$d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
However, other measures are possible:
the Manhattan distance: $d(X, Y) = \sum_{i=1}^{n} |x_i - y_i|$
the Chebychev distance: $d(X, Y) = \max_{i} |x_i - y_i|$
the cosine measure: $d(X, Y) = \dfrac{\sum_{i=1}^{n} x_i\, y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$
Pearson's correlation: $d(X, Y) = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$
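
The measures listed under Distance Measures translate directly into code. The sketch below uses their standard textbook definitions and assumes equal-length numeric vectors (non-zero and non-constant where a division is involved); the function names are illustrative.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def cosine(x, y):
    """Cosine of the angle between x and y (assumes neither is the zero vector)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def pearson(x, y):
    """Pearson correlation (assumes neither vector is constant)."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    den = math.sqrt(sum((a - x_bar) ** 2 for a in x)) * math.sqrt(sum((b - y_bar) ** 2 for b in y))
    return num / den

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), chebyshev((0, 0), (3, 4)))
```

Note that the cosine measure and Pearson's correlation grow with similarity, whereas the other three grow with dissimilarity, so a nearest-neighbour search has to maximize the former and minimize the latter.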

22 Similarity of Categorical Data
A similarity measure is the inverse of a distance measure. For categorical or binary data the distance is:
Simple match: $d(X, Y) = \sum_{i=1}^{n} \delta(x_i, y_i)$, where $\delta(x_i, y_i) = 0$ if $x_i = y_i$ and $\delta(x_i, y_i) = 1$ otherwise.
Jaccard coefficient: $Sim(X, Y) = \dfrac{|X \cap Y|}{|X \cup Y|}$

Attribute Weights
Not all attributes have the same importance in measuring similarity (or distance). Attributes could be weighted.
Non-weighted: $d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ versus weighted Euclidean distance: $d(X, Y) = \sqrt{\sum_{i=1}^{n} w_i\,(x_i - y_i)^2}$

What is a Decision Tree?
A decision tree is a flow-chart-like tree structure.
An internal node denotes a test on an attribute.
A branch represents an outcome of the test. All tuples in a branch have the same value for the tested attribute.
A leaf node represents a class label or a class label distribution.
[Figure: a generic tree whose internal nodes test attributes and whose leaves carry class labels.]

An Example from Quinlan's ID3
Training set:
Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N
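
The categorical measures and the attribute weighting from this page can be sketched as follows. This is a minimal illustration; the function names and sample values are assumptions, and treating two empty sets as identical in `jaccard` is a convention chosen here, not something the slides state.

```python
import math

def simple_match(x, y):
    """Number of attribute positions where the two records differ."""
    return sum(0 if a == b else 1 for a, b in zip(x, y))

def jaccard(x, y):
    """Jaccard coefficient of two item sets: size of intersection over size of union."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # assumed convention for two empty sets
    return len(x & y) / len(x | y)

def weighted_euclidean(x, y, weights):
    """Euclidean distance in which attribute i contributes with weight w_i."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))

print(simple_match(("sunny", "hot", "high"), ("sunny", "mild", "high")))
print(jaccard({"X", "Y", "Z"}, {"X", "Z"}))
print(weighted_euclidean((0, 0), (3, 4), (1, 0)))
```

Setting a weight to zero removes the attribute from the comparison, and equal weights of 1 give back the plain Euclidean distance.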

23 A Sample Decision Tree
Outlook?
  sunny: Humidity? (high: N; normal: P)
  overcast: P
  rain: Windy? (true: N; false: P)

Decision-Tree Classification Methods
The basic top-down decision tree generation approach usually consists of two phases:
1. Tree construction: at the start, all the training examples are at the root. Examples are partitioned recursively based on selected attributes.
2. Tree pruning: aims at removing tree branches that may reflect noise in the training data and lead to errors when classifying test data, in order to improve classification accuracy.

Decision Tree Construction
Recursive process:
The tree starts as a single node representing all the data.
If the samples are all of the same class, the node becomes a leaf labeled with the class label.
Otherwise, select the attribute that best separates the samples into individual classes.
Recursion stops when:
the samples in the node belong to the same class (majority);
there are no remaining attributes on which to split;
there are no samples with the attribute value.

Choosing the Attribute to Split the Data Set
The measure is also called a goodness function. Different algorithms may use different goodness functions:
information gain (ID3/C4.5): assumes all attributes to be categorical; can be modified for continuous-valued attributes.
gini index: assumes all attributes are continuous-valued; assumes there exist several possible split values for each attribute; may need other tools, such as clustering, to get the possible split values; can be modified for categorical attributes.

24 Information Gain (ID3/C4.5)
Assume that there are two classes, P and N. Let the set of examples S contain x elements of class P and y elements of class N. The amount of information needed to decide if an arbitrary example in S belongs to P or N is defined as:
$$I(S_P, S_N) = -\frac{x}{x+y}\log_2\frac{x}{x+y} - \frac{y}{x+y}\log_2\frac{y}{x+y}$$
In general, for n classes: $I(s_1, s_2, \ldots, s_n) = -\sum_{i=1}^{n} p_i \log_2(p_i)$, where $p_i$ is estimated by $s_i / s$.
Assume that using attribute A as the root of the tree will partition S into sets $\{S_1, S_2, \ldots, S_v\}$. If $S_i$ contains $x_i$ examples of P and $y_i$ examples of N, the information needed to classify objects in all subtrees $S_i$ is:
$$E(A) = \sum_{i=1}^{v} \frac{x_i + y_i}{x + y}\, I(S_{P_i}, S_{N_i})$$
In general: $E(A) = \sum_{j=1}^{v} \dfrac{s_{1j} + \cdots + s_{nj}}{s}\, I(s_{1j}, \ldots, s_{nj})$

Information Gain: Example
The attribute A is selected such that the information gain $gain(A) = I(S_P, S_N) - E(A)$ is maximal, that is, E(A) is minimal, since $I(S_P, S_N)$ is the same for all attributes at a node.
In the given sample data, attribute outlook is chosen to split at the root:
gain(outlook) = 0.246
gain(temperature) = 0.029
gain(humidity) = 0.151
gain(windy) = 0.048
The information gain measure tends to favor attributes with many values. Other possibilities: Gini index, χ², etc.

Gini Index
If a data set S contains examples from n classes, the gini index gini(S) is defined as
$$gini(S) = 1 - \sum_{j=1}^{n} p_j^2$$
where $p_j$ is the relative frequency of class j in S.
If a data set S is split into two subsets $S_1$ and $S_2$ with sizes $N_1$ and $N_2$ respectively (with $N = N_1 + N_2$), the gini index of the split data is defined as
$$gini_{split}(S) = \frac{N_1}{N}\, gini(S_1) + \frac{N_2}{N}\, gini(S_2)$$
The attribute that provides the smallest $gini_{split}(S)$ is chosen to split the node (this requires enumerating all possible splitting points for each attribute).

Example for Gini Index
Suppose there are two attributes, age and income, and the class label is buy or not buy.
There are three possible split values for age: 30, 40, 50.
There are two possible split values for income: 30K, 40K.
We need to calculate the following gini indices: $gini_{age=30}(S)$, $gini_{age=40}(S)$, $gini_{age=50}(S)$, $gini_{income=30K}(S)$, $gini_{income=40K}(S)$.
Choose the minimal one as the split attribute.
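
Both goodness functions can be computed directly from class counts. The sketch below applies the information-gain definition to the 14-record weather training set from Quinlan's ID3 example and implements the gini index for a binary split; the function and variable names are illustrative assumptions.

```python
import math
from collections import Counter

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
DATA = """
sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
overcast cool normal true P
sunny mild high false N
sunny cool normal false P
rain mild normal false P
sunny mild normal true P
overcast mild high true P
overcast hot normal false P
rain mild high true N
"""
rows = [dict(zip(ATTRIBUTES + ["class"], line.split()))
        for line in DATA.split("\n") if line.strip()]

def information(labels):
    """I(s_1, ..., s_n) = -sum p_i log2 p_i, with p_i estimated by s_i / s."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def expected_information(rows, attribute):
    """E(A): information of each subset, weighted by the subset's share of the rows."""
    total = len(rows)
    e = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r["class"] for r in rows if r[attribute] == value]
        e += len(subset) / total * information(subset)
    return e

def gain(rows, attribute):
    return information([r["class"] for r in rows]) - expected_information(rows, attribute)

def gini(labels):
    """gini(S) = 1 - sum p_j^2."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    """Size-weighted gini index of a binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

for attribute in ATTRIBUTES:
    print(attribute, round(gain(rows, attribute), 3))
```

On this data the outlook attribute obtains the largest gain, which is why it is chosen for the root. For the gini index, one would call `gini_split` once per candidate split value and keep the smallest result.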

25 Primary Issues in Tree Construction
Split criterion: used to select the attribute to be split at a tree node during the tree generation phase. Different algorithms may use different goodness functions: information gain, gini index, etc.
Branching scheme: determining the tree branch to which a sample belongs; binary splitting (gini index) versus many splitting (information gain).
Stopping decision: when to stop the further splitting of a node, e.g. based on an impurity measure.
Labeling rule: a node is labeled as the class to which most samples at the node belong.

How to construct a tree?
Algorithm:
greedy algorithm: make the optimal choice at each step, i.e. select the best attribute for each tree node;
top-down recursive divide-and-conquer manner: from root to leaf, split a node into several branches; for each branch, recursively run the algorithm.

Example for Algorithm (ID3)
All attributes are categorical.
Create a node N;
if the samples are all of the same class C, then return N as a leaf node labeled with C;
if attribute-list is empty, then return N as a leaf node labeled with the most common class;
select the split-attribute with the highest information gain;
label N with the split-attribute;
for each value $A_i$ of the split-attribute, grow a branch from node N;
let $S_i$ be the branch in which all tuples have the value $A_i$ for the split-attribute;
if $S_i$ is empty, then attach a leaf labeled with the most common class;
else recursively run the algorithm at node $S_i$;
until all branches reach leaf nodes.

How to use a tree?
Directly: test the attribute values of the unknown sample against the tree. A path is traced from the root to a leaf, which holds the label.
Indirectly: the decision tree is converted to classification rules. One rule is created for each path from the root to a leaf. IF-THEN rules are easier for humans to understand.
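
The ID3 steps above can be sketched as a short recursive function, together with the direct way of using the resulting tree. This is an illustrative sketch on the weather training set from Quinlan's example, not a reference implementation; the nested-dictionary tree representation and all names (`id3`, `classify`, `domains`) are assumptions made for the example.

```python
import math
from collections import Counter

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
DATA = """
sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
overcast cool normal true P
sunny mild high false N
sunny cool normal false P
rain mild normal false P
sunny mild normal true P
overcast mild high true P
overcast hot normal false P
rain mild high true N
"""
rows = [dict(zip(ATTRIBUTES + ["class"], line.split()))
        for line in DATA.split("\n") if line.strip()]
# All values each attribute can take, so empty branches can be detected.
domains = {a: sorted({r[a] for r in rows}) for a in ATTRIBUTES}

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attribute):
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r["class"] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r["class"] for r in rows]) - remainder

def id3(rows, attributes):
    labels = [r["class"] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: all samples share one class, or no attribute is left to split on.
    if len(set(labels)) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: info_gain(rows, a))
    remaining = [a for a in attributes if a != best]
    node = {"attribute": best, "branches": {}}
    for value in domains[best]:
        subset = [r for r in rows if r[best] == value]
        # Empty branch: attach a leaf labeled with the most common class.
        node["branches"][value] = id3(subset, remaining) if subset else majority
    return node

def classify(tree, sample):
    """Trace a path from the root to a leaf and return the label stored there."""
    while isinstance(tree, dict):
        tree = tree["branches"][sample[tree["attribute"]]]
    return tree

tree = id3(rows, ATTRIBUTES)
print(tree)
print(classify(tree, {"outlook": "sunny", "temperature": "cool",
                      "humidity": "high", "windy": "true"}))
```

Each root-to-leaf path of the returned dictionary corresponds to one IF-THEN rule, for example IF outlook = sunny AND humidity = high THEN N.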

26 Avoid Over-fitting in Classification
A generated tree may over-fit the training examples due to noise or too small a set of training data.
Two approaches to avoid over-fitting:
(Stop earlier): stop growing the tree earlier.
(Post-prune): allow over-fit and then post-prune the tree.
Approaches to determine the correct final tree size:
Separate training and testing sets, or use cross-validation.
Use all the data for training, but apply a statistical test (e.g., chi-square) to estimate whether expanding or pruning a node may improve over the entire distribution.
Use the Minimum Description Length (MDL) principle: halt the growth of the tree when the encoding is minimized.
Rule post-pruning (C4.5): convert to rules before pruning.

Continuous and Missing Values in Decision-Tree Induction
Dynamically define new discrete-valued attributes that partition the continuous attribute values into a discrete set of intervals.
[Table: Temperature (values missing) against play tennis: No, No, Yes, Yes, Yes, No.]
Sort the examples according to the continuous attribute A, then identify adjacent examples that differ in their target classification, generate a set of candidate thresholds midway, and select the one with the maximum gain.
Extensible to split continuous attributes into multiple intervals.
Assign missing attribute values either by assigning the most common value of A(x), or by assigning a probability to each of the possible values of A.

Alternative Measures for Selecting Attributes
Info gain naturally favours attributes with many values.
One alternative measure is the gain ratio (Quinlan 86), which penalizes attributes with many values:
$$SplitInfo(S, A) = -\sum_{i} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$
$$GainRatio(S, A) = \frac{Gain(S, A)}{SplitInfo(S, A)}$$
Problem: the denominator can be 0 or close to it, which makes GainRatio very large.
Distance-based measure (Lopez de Mantaras 91): define a distance metric between partitions of the data and choose the one closest to the perfect partition.
There are many other measures. Mingers 9 provides an experimental analysis of the effectiveness of several selection measures over a variety of problems.

Tree Pruning
A decision tree constructed using the training data may have too many branches/leaf nodes. This is caused by noise and over-fitting, and may result in poor accuracy for unseen samples.
Prune the tree: merge a subtree into a leaf node, using a set of data different from the training data. At a tree node, if the accuracy without splitting is higher than the accuracy with splitting, replace the subtree with a leaf node and label it using the majority class.
Issues: obtaining the testing data; criteria other than accuracy (e.g. minimum description length).
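
The threshold search for a continuous attribute and the gain-ratio correction can be sketched as follows. The temperature readings in the usage lines are assumed purely for illustration (the page does not preserve the slide's own values), the function names are hypothetical, and returning 0.0 when SplitInfo is zero is one possible guard against the denominator problem, not a rule from the lecture.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def candidate_thresholds(values, labels):
    """Midpoints between adjacent sorted examples whose class labels differ."""
    pairs = sorted(zip(values, labels))
    return [(a + b) / 2
            for (a, label_a), (b, label_b) in zip(pairs, pairs[1:])
            if label_a != label_b and a != b]

def gain_for_threshold(values, labels, threshold):
    """Information gain of the binary split value <= threshold versus value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    return entropy(labels) - len(left) / n * entropy(left) - len(right) / n * entropy(right)

def best_threshold(values, labels):
    return max(candidate_thresholds(values, labels),
               key=lambda t: gain_for_threshold(values, labels, t))

def split_info(sizes):
    """SplitInfo computed from the subset sizes |S_i|."""
    total = sum(sizes)
    return -sum(s / total * math.log2(s / total) for s in sizes if s > 0)

def gain_ratio(gain, sizes):
    denominator = split_info(sizes)
    return gain / denominator if denominator > 0 else 0.0  # assumed guard

# Assumed temperature readings, paired with the play-tennis labels.
temperature = [40, 48, 60, 72, 80, 90]
play = ["No", "No", "Yes", "Yes", "Yes", "No"]
print(candidate_thresholds(temperature, play))
print(best_threshold(temperature, play))
```

Only the two class boundaries produce candidate thresholds, so the gain has to be evaluated twice instead of once per distinct value.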

27 Pruning Criterion
Use a separate set of examples to evaluate the utility of post-pruning nodes from the tree. CART uses cost-complexity pruning.
Apply a statistical test to estimate whether expanding (or pruning) a particular node is worthwhile. C4.5 uses pessimistic pruning.
Minimum Description Length (no test sample needed). SLIQ and SPRINT use MDL pruning.

Pruning Criterion: MDL
The best binary decision tree is the one that can be encoded with the fewest number of bits:
select a scheme to encode a tree;
compare various subtrees using the cost of encoding;
the best model minimizes the cost.
Encoding schema:
one bit to specify whether a node is a leaf (0) or an internal node (1);
log a bits to specify the splitting attribute;
splitting the value for the attribute: categorical: log(v-1) bits; numerical: log v -

How do Associative Classifiers Work?
[Figure: a transaction table (Transaction ID, Items Bought, with item lists such as X,Y,Z; X,Z; X,V; U,V,W) is read as transactions {Tid, Item 1, Item 2, Item 3, ..., Item t}; a relational table (Attr 1, Attr 2, Attr 3, ..., Attr N, Class Label) is read as transactions {Item 1, Item 2, Item 3, ..., Item N, Class}.
Association rules: frequent k-itemsets {Item a, Item b, ..., Item k} yield rules {Itemset → Itemset}.
Constrained association rules: constrained frequent k-itemsets {Item a, ..., Item k, Class x} yield rules {Itemset → Class}.]

28 Modeling Data
[Figure: different kinds of input are modeled as transactions, from which itemsets and rules are derived.
Market baskets: {bread, milk, beer, ...} gives (bread, milk) and the rule bread → milk.
Documents: {term1, term2, ..., Cat} gives (term, Cat) and the rule term → Cat. The sample document shown reads: "Background, Motivation and General Outline of the Proposed Project. We have been collecting tremendous amounts of information, counting on the power of computers to help efficiently sort through this amalgam of information. Unfortunately, these massive collections of data stored on disparate dispersed media very rapidly become overwhelming. Regrettably, most of the collected large datasets remain unanalyzed due to lack of appropriate, effective and scalable techniques."
Automatic diagnostic: {f1, f2, ..., Cat} gives (f3, f5, Cat) and the rule f3 ∧ f5 → Cat.]

General Approach
[Figure: the input data is modeled into a set of transactions <{i1, i2, ..., ik}, c> (the training data). Rule generation produces a set of association rules; rule pruning reduces it to a set of pruned rules. For each new object, also modeled as a transaction, rule selection picks the applicable rules and then the selected rules, which label the new object. Unlabeled new objects become labeled objects.]

Association Rules: Classification for all Categories
CBA (1998) [Apriori, confidence]: single class.
CMAR (2001) [FP-Growth, χ²]: single class.
ARC-AC (2002) [Apriori, confidence vote]: multi class.
[Figure: the training data of all categories yields one set of association rules for all categories; the associative classifier ARC-AC puts each new object in its predicted class.]

Association Rules: Classification by Category
ARC-BC (2002)
[Figure: each category yields its own set of association rules; the associative classifier ARC-BC combines them and puts each new object in its predicted class.]

29 Association Rules: Advantages & Issues
Advantages:
Association rules (AR) are well studied: fast and scalable.
No independence assumption between attributes.
Attributes: large number, variable number, can handle missing values.
Transparency.
Issues:
Associative classifiers (AC) are at an early stage of development: they use simple rules and a naïve selection function.
AC models consist of a large number of rules: harder selection; redundant, uninteresting rules; longer classification time; difficult to manually revisit rules.
Solution: pruning techniques.

Pruning Rules
Problems: large number of rules; noisy information; long classification time. Solution: pruning techniques.
Removing low ranked specialized rules, e.g. R1: F1 → C (confidence 90%) and R2: F1 ∧ F2 → C (confidence 80%);
eliminating conflicting rules (for single-class classification), e.g. F1 → C1 and F1 → C2;
database coverage.

Classification Stage
Let S be the classification system.
A new object O <f…; f3; f4; f7; f9>
Applicable rules:
f… → C… (confidence 0.9)
f3 & f4 → C… (confidence 0.85)
f4 → C… (confidence 0.8)
f7 → C… (confidence 0.6)
f9 → C3 (confidence 0.5)
Category scores: 0.85, 0.75, and 0.5 for C3.
Using the dominance factor δ we choose the winning categories. If δ = 100%, only the category scoring 0.85 is winning. If δ = 80%, O is predicted to fall in the two categories scoring 0.85 and 0.75.

Summary
[Figure: model the input data into a set of transactions <{i1, i2, ..., ik}, c>; rule generation produces a set of rules; rule pruning produces a smaller set of rules; rule selection turns unlabeled new objects into labeled objects.]

30 Open Problems?
[Figure: the associative classification pipeline annotated with open problems.
Learning: set of transactions <{i1, i2, ..., ik}, c> (training data): modelling transactions to incorporate more information.
Rule generation: support threshold-free rule generation.
Association rules: rule value measure.
Rule pruning: new heuristics and new pruning strategies.
Pruned rules: ranking rules; rule representation.
Classification: rule selection from the applicable rules to the selected rules for a new object: new heuristics and new selection strategies. The new object is then labelled.]

What is Prediction?
The goal of prediction is to forecast or deduce the value of an attribute based on the values of other attributes. A model is first created based on the data distribution. The model is then used to predict future or unknown values.
In Data Mining:
if forecasting a discrete value: Classification;
if forecasting a continuous value: Prediction.

Prediction
Prediction of continuous values can be modeled by statistical techniques: linear regression, multiple regression, polynomial regression, Poisson regression, log-linear regression, etc.

Linear Regression
Linear regression approximates the data distribution by a line $Y = \alpha + \beta X$. Y is the response variable and X the predictor variable. α and β are regression coefficients specifying the intercept and the slope of the line. They are calculated by the least squares method:
$$\beta = \frac{\sum_{i=1}^{s} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{s} (x_i - \bar{x})^2}, \qquad \alpha = \bar{y} - \beta\,\bar{x}$$
where $\bar{x}$ and $\bar{y}$ are respectively the averages of $x_1, x_2, \ldots, x_s$ and $y_1, y_2, \ldots, y_s$.
Multiple regression: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2$.
Many nonlinear functions can be transformed into the above.
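
The least squares estimates of the intercept α and the slope β can be computed in a few lines. A minimal sketch, assuming plain Python lists of equal length with at least two distinct x values; the function name `fit_line` and the sample points are illustrative.

```python
def fit_line(xs, ys):
    """Return (alpha, beta) of the least squares line Y = alpha + beta * X."""
    s = len(xs)
    x_bar = sum(xs) / s
    y_bar = sum(ys) / s
    beta = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical points lying exactly on the line Y = 1 + 2X.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
alpha, beta = fit_line(xs, ys)
print(alpha, beta)

# Predict the response for an unseen predictor value.
print(alpha + beta * 10)
```

Because the sample points lie exactly on a line, the fit recovers its intercept and slope; with noisy data the same formulas give the line that minimizes the sum of squared residuals.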

Bayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier

Bayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier Baa Classfcato CS6L Data Mg: Classfcato() Referece: J. Ha ad M. Kamber, Data Mg: Cocepts ad Techques robablstc learg: Calculate explct probabltes for hypothess, amog the most practcal approaches to certa

More information

Bayes (Naïve or not) Classifiers: Generative Approach

Bayes (Naïve or not) Classifiers: Generative Approach Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg

More information

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall

More information

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

An Introduction to. Support Vector Machine

An Introduction to. Support Vector Machine A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork

More information

Introduction to local (nonparametric) density estimation. methods

Introduction to local (nonparametric) density estimation. methods Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest

More information

Unsupervised Learning and Other Neural Networks

Unsupervised Learning and Other Neural Networks CSE 53 Soft Computg NOT PART OF THE FINAL Usupervsed Learg ad Other Neural Networs Itroducto Mture Destes ad Idetfablty ML Estmates Applcato to Normal Mtures Other Neural Networs Itroducto Prevously, all

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades STAT 101 Dr. Kar Lock Morga 11/20/12 Exam 2 Grades Multple Regresso SECTIONS 9.2, 10.1, 10.2 Multple explaatory varables (10.1) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (10.2) Trasformatos

More information

Dimensionality reduction Feature selection

Dimensionality reduction Feature selection CS 750 Mache Learg Lecture 3 Dmesoalty reducto Feature selecto Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 750 Mache Learg Dmesoalty reducto. Motvato. Classfcato problem eample: We have a put data

More information

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture)

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture) CSE 546: Mache Learg Lecture 6 Feature Selecto: Part 2 Istructor: Sham Kakade Greedy Algorthms (cotued from the last lecture) There are varety of greedy algorthms ad umerous amg covetos for these algorthms.

More information

Solving Constrained Flow-Shop Scheduling. Problems with Three Machines

Solving Constrained Flow-Shop Scheduling. Problems with Three Machines It J Cotemp Math Sceces, Vol 5, 2010, o 19, 921-929 Solvg Costraed Flow-Shop Schedulg Problems wth Three Maches P Pada ad P Rajedra Departmet of Mathematcs, School of Advaced Sceces, VIT Uversty, Vellore-632

More information

Kernel-based Methods and Support Vector Machines

Kernel-based Methods and Support Vector Machines Kerel-based Methods ad Support Vector Maches Larr Holder CptS 570 Mache Learg School of Electrcal Egeerg ad Computer Scece Washgto State Uverst Refereces Muller et al. A Itroducto to Kerel-Based Learg

More information

Machine Learning. knowledge acquisition skill refinement. Relation between machine learning and data mining. P. Berka, /18

Machine Learning. knowledge acquisition skill refinement. Relation between machine learning and data mining. P. Berka, /18 Mache Learg The feld of mache learg s cocered wth the questo of how to costruct computer programs that automatcally mprove wth eperece. (Mtchell, 1997) Thgs lear whe they chage ther behavor a way that

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

Naïve Bayes MIT Course Notes Cynthia Rudin

Naïve Bayes MIT Course Notes Cynthia Rudin Thaks to Şeyda Ertek Credt: Ng, Mtchell Naïve Bayes MIT 5.097 Course Notes Cytha Rud The Naïve Bayes algorthm comes from a geeratve model. There s a mportat dstcto betwee geeratve ad dscrmatve models.

More information

Generative classification models

Generative classification models CS 75 Mache Learg Lecture Geeratve classfcato models Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Data: D { d, d,.., d} d, Classfcato represets a dscrete class value Goal: lear f : X Y Bar classfcato

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

6. Nonparametric techniques

6. Nonparametric techniques 6. Noparametrc techques Motvato Problem: how to decde o a sutable model (e.g. whch type of Gaussa) Idea: just use the orgal data (lazy learg) 2 Idea 1: each data pot represets a pece of probablty P(x)

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set. Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the

More information

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg

More information

Objectives of Multiple Regression

Objectives of Multiple Regression Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of

More information

Statistics MINITAB - Lab 5

Statistics MINITAB - Lab 5 Statstcs 10010 MINITAB - Lab 5 PART I: The Correlato Coeffcet Qute ofte statstcs we are preseted wth data that suggests that a lear relatoshp exsts betwee two varables. For example the plot below s of

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA THE ROYAL STATISTICAL SOCIETY EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER II STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for

More information

Lecture Notes Types of economic variables

Lecture Notes Types of economic variables Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte

More information

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR Pot Patter Aalyss Part I Outle Revst IRP/CSR, frst- ad secod order effects What s pot patter aalyss (PPA)? Desty-based pot patter measures Dstace-based pot patter measures Revst IRP/CSR Equal probablty:

More information

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis)

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis) We have covered: Selecto, Iserto, Mergesort, Bubblesort, Heapsort Next: Selecto the Qucksort The Selecto Problem - Varable Sze Decrease/Coquer (Practce wth algorthm aalyss) Cosder the problem of fdg the

More information

This lecture and the next. Why Sorting? Sorting Algorithms so far. Why Sorting? (2) Selection Sort. Heap Sort. Heapsort

This lecture and the next. Why Sorting? Sorting Algorithms so far. Why Sorting? (2) Selection Sort. Heap Sort. Heapsort Ths lecture ad the ext Heapsort Heap data structure ad prorty queue ADT Qucksort a popular algorthm, very fast o average Why Sortg? Whe doubt, sort oe of the prcples of algorthm desg. Sortg used as a subroute

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

(b) By independence, the probability that the string 1011 is received correctly is
