Course Content. What is Classification? Chapter 4 Objectives
Principles of Knowledge Discovery, Fall 2007
Chapter 4: Classification
Dr. Osmar R. Zaïane, University of Alberta

Course Content
- Introduction to Data Mining
- Association Analysis
- Sequential Pattern Analysis
- Classification and prediction
- Contrast Sets
- Clustering
- Outlier Detection
- Web Mining
- Other topics if time permits (spatial data, biomedical data, etc.)

Chapter 4 Objectives
- Learn basic techniques for data classification and prediction.
- Introduce techniques such as Neural Networks, Naïve Bayesian Classification, k-Nearest Neighbours, Decision Trees and Associative Classifiers.
- Realize the difference between supervised classification, prediction and unsupervised classification of data.

What is Classification?
The goal of data classification is to organize and categorize data in distinct classes.
- A model is first created based on the data distribution.
- The model is then used to classify new data.
- Given the model, a class can be predicted for new data.
With classification, I can predict in which bucket to put the ball, but I can't predict the weight of the ball.
Classification = Learning a Model
[Figure: a labeled training set is used to learn a classification model; the model then labels (classifies) new unlabeled data.]

Application
- Credit approval
- Target marketing
- Medical diagnosis
- Defective parts identification in manufacturing
- Crime zoning
- Treatment effectiveness analysis
- Etc.

Classification is a three-step process

1. Model construction (Learning):
- Each tuple is assumed to belong to a predefined class, as determined by one of the attributes, called the class label.
- The set of all tuples used for construction of the model is called the training set.
- The model is represented in one of the following forms: classification rules (IF-THEN statements), a decision tree, or mathematical formulae.

2. Model evaluation (Accuracy):
- Estimate the accuracy rate of the model based on a test set.
- The known label of each test sample is compared with the classified result from the model.
- The accuracy rate is the percentage of test set samples that are correctly classified by the model.
- The test set must be independent of the training set; otherwise the accuracy estimate is optimistically biased and over-fitting goes undetected.
3. Model use (Classification):
- The model is used to classify unseen objects.
- Give a class label to a new tuple.
- Predict the value of an actual attribute.

Classification with Holdout
[Figure: the data is split into training data, used to derive the classifier (model), and testing data, used to estimate accuracy.]
- Holdout: partition D in two, one part for training and one for test.
- Random sub-sampling: repeated holdouts with different partitioning.
- K-fold cross validation: K partitions of D; do k experiments, each with a different test partition. Unlike random sub-sampling, the test partitions do not overlap. Typically k = 10.
- Bootstrapping: N samples drawn from D with replacement (a sample of size |D| drawn this way contains about 63% of the distinct tuples of D).
- Leave-one-out: |D| experiments, each with |D| - 1 tuples for training and one for test.

1. Classification Process (Learning)
Training data -> classification algorithms -> classifier (model)

Name     Income  Age       Credit rating
Bruce    Low     <30       bad
Dave     Medium  [30..40]  good
William  High    <30       good
Marie    Medium  >40       good
Anne     Low     [30..40]  good
Chris    Medium  <30       bad

Classifier (model): IF Income = High OR Age > 30 THEN CreditRating = Good

2. Classification Process (Accuracy Evaluation)
Testing data -> classifier (model)

Name  Income  Age       Credit rating
Tom   Medium  <30       bad
Jane  High    <30       bad
Wei   High    >40       good
Hua   Medium  [30..40]  good

How accurate is the model IF Income = High OR Age > 30 THEN CreditRating = Good?
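The accuracy question on the evaluation slide can be answered mechanically. The sketch below applies the IF-THEN rule to the four test tuples and also shows one simple way to produce k-fold partitions. The function names and the reading of the age bands (treating "[30..40]" and ">40" as satisfying "Age > 30") are assumptions made for illustration, not part of the slides.

```python
# Test tuples from the accuracy-evaluation slide: (name, income, age band, actual rating).
TEST_SET = [
    ("Tom", "Medium", "<30", "bad"),
    ("Jane", "High", "<30", "bad"),
    ("Wei", "High", ">40", "good"),
    ("Hua", "Medium", "[30..40]", "good"),
]

# Assumption: the bands "[30..40]" and ">40" both count as "Age > 30".
OLDER_THAN_30 = {"[30..40]", ">40"}


def classify(income, age_band):
    """IF Income = High OR Age > 30 THEN CreditRating = good, otherwise bad."""
    return "good" if income == "High" or age_band in OLDER_THAN_30 else "bad"


def accuracy(test_set):
    """Fraction of test tuples whose predicted label equals the known label."""
    correct = sum(1 for _, income, age, actual in test_set
                  if classify(income, age) == actual)
    return correct / len(test_set)


def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k-fold cross validation over n tuples."""
    folds = [list(range(i, n, k)) for i in range(k)]  # k disjoint partitions
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    print(accuracy(TEST_SET))  # Jane is the only misclassified tuple
    for train, test in k_fold_splits(10, 5):
        print(test, train)
```

Under these assumptions the rule labels three of the four test tuples correctly (Jane has a high income but a bad rating), so the estimated accuracy is 75%.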
3. Classification Process (Classification)
New data -> classifier (model) -> credit rating?

Name  Income  Age       Credit rating
Paul  High    [30..40]  ?

Improving Accuracy
[Figure: the data is used to train Classifier 1, Classifier 2, Classifier 3, ..., Classifier n; their votes are combined into a composite classifier, which is applied to new data.]

Evaluating Classification Methods
- Predictive accuracy: ability of the model to correctly predict the class label.
- Speed and scalability: time to construct the model; time to use the model.
- Robustness: handling noise and missing values.
- Scalability: efficiency in large databases (data that is not memory resident).
- Interpretability: the level of understanding and insight provided by the model.
- Form of rules: decision tree size; the compactness of classification rules.

Framework (Supervised Learning)
[Figure: labeled data is split into training data, used to derive the classifier (model), and testing data, used to estimate accuracy; the model is then applied to unlabeled new data.]
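The "combine votes" idea from the Improving Accuracy slide can be sketched as a simple majority vote. The three member classifiers below are hypothetical rules invented for this illustration (only the first one appears in the slides), and an odd number of voters is assumed so that two-class ties cannot occur.

```python
from collections import Counter


def rule_from_slides(example):
    """IF Income = High OR Age > 30 THEN good (age bands read as in the slides)."""
    older = example["age"] in {"[30..40]", ">40"}
    return "good" if example["income"] == "High" or older else "bad"


def income_only(example):
    """Hypothetical classifier that looks at income alone."""
    return "good" if example["income"] == "High" else "bad"


def strict_age(example):
    """Hypothetical classifier that only trusts the oldest age band."""
    return "good" if example["age"] == ">40" else "bad"


def majority_vote(classifiers, example):
    """Composite classifier: return the label predicted by most members."""
    votes = Counter(clf(example) for clf in classifiers)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    paul = {"income": "High", "age": "[30..40]"}
    members = [rule_from_slides, income_only, strict_age]
    print(majority_vote(members, paul))
```

For Paul, two of the three hypothetical members vote "good" and one votes "bad", so the composite classifier answers "good".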
Classification Methods
- Neural Networks
- Bayesian Classification
- K-Nearest Neighbour
- Decision Tree Induction
- Associative Classifiers
- Support Vector Machines
- Case-Based Reasoning
- Genetic Algorithms
- Rough Set Theory
- Fuzzy Sets
- Etc.

Lecture Outline
Part I: Artificial Neural Networks (ANN)
- Introduction to Neural Networks
- Biological Neural System
- What is an artificial neural network?
- Neuron model and activation function
- Construction of a neural network
- Learning: Backpropagation Algorithm
- Forward propagation of signal
- Backward propagation of error
- Example
Part II: Bayesian Classifiers (Statistical-based)
- What is Bayesian Classification
- Bayes theorem
- Naïve Bayes Algorithm
- Using Laplace Estimate
- Handling Missing Values and Numerical Data
- Belief Networks

Human Nervous System
- We have only just begun to understand how our neural system operates.
- A huge number of neurons and interconnections between them:
  - tens of billions of neurons in the brain (on the order of $10^{10}$ to $10^{11}$); for scale, a full Olympic-sized swimming pool contains about $10^{10}$ raindrops, and the number of stars in the Milky Way is of the same magnitude
  - about $10^{4}$ connections per neuron
- Biological neurons are slower than computers: neurons operate in about $10^{-3}$ seconds, computers in about $10^{-9}$ seconds.
- The brain makes up for the slow rate of operation of a single neuron by the large number of neurons and connections (think about the speed of face recognition by a human, for example, and the time it takes fast computers to do the same task).

Biological Neurons
- The purpose of neurons: transmit information in the form of electrical signals.
  - A neuron accepts many inputs, which are all added up in some way.
  - If enough active inputs are received at once, the neuron is activated and fires; if not, it remains in its inactive state.
- Structure of a neuron:
  - Cell body: contains the nucleus holding the chromosomes
  - Dendrites
  - Axon
  - Synapse: couples the axon with the dendrite of another cell; information is passed from one neuron to another through synapses; there is no direct linkage across the junction, it is a chemical one.
Operation of biological neurons
- Signals are transmitted between neurons by electrical pulses (action potentials, AP) traveling along the axon.
- When the potential at the synapse is raised sufficiently by the AP, it releases chemicals called neurotransmitters; it may take the arrival of more than one AP before the synapse is triggered.
- The neurotransmitters diffuse across the gap and chemically activate gates on the dendrites, which allows charged ions to flow.
- The flow of ions alters the potential of the dendrite and provides a voltage pulse on the dendrite (post-synaptic potential, PSP):
  - some synapses excite the dendrite they affect, while others inhibit it
  - the synapses also determine the strength of the new input signal
- Each PSP travels along its dendrite and spreads over the soma (cell body).
- The soma sums the effects of thousands of PSPs; if the resulting potential exceeds a threshold, the neuron fires and generates another AP.

What is an Artificial Neural Network (NN)?
- A neural network is a data structure that supposedly simulates the behaviour of neurons in a biological brain.
- A neural network is composed of layers of interconnected units.
- Messages are passed along the connections from one unit to the other.
- Messages can change based on the weight of the connection and the value in the node.
[Figure: a feedforward network; the input vector $x_i$ enters the input nodes, passes through the hidden nodes, and the output nodes produce the output vector.]

What is an Artificial Neural Network (NN)? (continued)
- A network of many simple units (neurons, nodes).
- The units are connected by connections. Each connection has an associated numeric weight.
- Units receive inputs (from the environment or other units) via the connections. They produce output using their weights and the inputs (i.e. they operate locally).
- A NN can be represented as a directed graph.
- NNs learn from examples and exhibit some capability for generalization beyond the training data:
  - knowledge is acquired by the network from its environment via learning and is stored in the weights of the connections
  - the training (learning) rule is a procedure for modifying the weights of connections in order to perform a certain task
  - there are also some sophisticated techniques that allow learning by adding and pruning connections (between nodes)

A Neuron
[Figure: the input vector $x = (x_0, x_1, \ldots, x_n)$ and the weight vector $w = (w_0, w_1, \ldots, w_n)$ feed a weighted sum with bias $\theta$; the activation function $f$ produces the output $y$.]
The n-dimensional input vector x is mapped to the variable y by means of the scalar product and a nonlinear function mapping.
Neuron Model
- Each connection from unit i to unit j has a numeric weight $w_{ij}$ associated with it, which determines the strength and the sign of the connection.
- Each neuron first computes the weighted sum of its inputs $wp$, and then applies an activation function $f$ to derive the output (activation) $a$.
- A neuron may have a special weight called the bias weight $b$.
- NNs represent a function of their weights (parameters). By adjusting the weights, we change this function. This is done by using a learning rule.
- Example: if there are 2 inputs $p_1 = 2$ and $p_2 = 3$, and if $w_1 = 3$, $w_2 = 1$, $b = -1.5$, then $a = f(p_1 w_1 + p_2 w_2 + b) = f(2 \cdot 3 + 3 \cdot 1 - 1.5) = f(7.5)$. What is f?

Activation function
- Also called processing element, squashing function, or firing rule.
- It is applied by each neuron to its input values and weights (as well as the bias): $S = \theta + \sum_{j=1}^{n} x_j W_j$
- The output can be unipolar, in $[0, 1]$, or bipolar, in $[-1, 1]$.
- The function can be:
  - Linear: $f(S) = cS$
  - Thresholded (step): $f(S) = 1$ if $S > T$, and $0$ otherwise
  - Sigmoid: $f(S) = \frac{1}{1 + e^{-cS}}$
  - Gaussian: $f(S) = e^{-S^2 / v}$
  - etc.

Correspondence Between Artificial and Biological Neurons
How does this artificial neuron relate to the biological one?
- input p (or input vector p): input signal (or signals) at the dendrite
- weight w (or weight vector w): strength of the synapse (or synapses)
- summer and transfer function: cell body
- neuron output a: signal at the axon

Constructing the Network
- The number of input nodes generally corresponds to the dimensionality of the input tuples. Input is converted to binary and concatenated to form a bitstream.
  E.g. age 20-80 split into 6 intervals can be coded either with one bit per interval (a 6-bit code with a single 1, e.g. 000001 for [20, 30), 000010 for [30, 40), ..., 100000 for [70, 80)) or with a compact 3-bit binary code (001 for [20, 30), 010 for [30, 40), ..., 110 for [70, 80)).
- Number of hidden nodes: determined by an expert or, in some cases, adjusted during training.
- Number of output nodes: generally the number of classes.
  E.g. 10 classes $C_1, C_2, \ldots, C_{10}$ can also be coded in binary: 0001 for $C_1$, 0010 for $C_2$, ..., 1010 for $C_{10}$.
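The worked example from the Neuron Model slide (two inputs, two weights and a bias) can be reproduced in a few lines. This is a minimal sketch; the function names and the default values of the parameters c, T and v are assumptions for illustration.

```python
import math


def weighted_sum(inputs, weights, bias):
    """S = bias + sum_j (x_j * w_j)."""
    return sum(p * w for p, w in zip(inputs, weights)) + bias


def linear(s, c=1.0):
    return c * s


def threshold(s, t=0.0):
    """Step function: 1 if S > T, 0 otherwise."""
    return 1.0 if s > t else 0.0


def sigmoid(s, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * s))


def gaussian(s, v=1.0):
    return math.exp(-(s * s) / v)


if __name__ == "__main__":
    # Values from the slide: p1 = 2, p2 = 3, w1 = 3, w2 = 1, b = -1.5.
    s = weighted_sum([2, 3], [3, 1], -1.5)
    print(s)             # net input of the neuron
    print(threshold(s))  # fires, since the net input exceeds T = 0
    print(sigmoid(s))    # close to, but below, 1
```

The net input is 7.5 as on the slide; which output the neuron produces then depends entirely on the choice of f.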
Neural Networks - Pros and Cons
Advantages
- Prediction accuracy is generally high.
- Robust: works when training examples contain errors.
- Output may be discrete, real-valued, or a vector of several discrete or real-valued attributes.
- Fast evaluation of the learned target function.
Criticism
- Long training time.
- Difficult to understand the learned function (weights).
- Typically for numerical data.
- Not easy to incorporate domain knowledge.
- Design can be tedious and error prone (too small: slow learning; too big: instability or poor performance).

Learning Paradigms
(1) Classification: adjust weights using Error = Desired - Actual.
(2) Reinforcement: adjust weights using reinforcement.
[Figure: inputs from the training data are fed to the network; the actual output is compared with the class label of the training tuple.]

Learning Algorithms
- Back propagation for classification
- Kohonen feature maps for clustering
- Recurrent back propagation for classification
- Radial basis function for classification
- Adaptive resonance theory
- Probabilistic neural networks

Major Steps for Back Propagation Network
- Constructing a network:
  - input data representation
  - selection of number of layers, number of nodes in each layer
- Training the network using training data
- Pruning the network
- Interpreting the results
Network Training
The ultimate objective of training is to obtain a set of weights that makes almost all the tuples in the training data classified correctly.
Steps:
- Initial weights are set randomly.
- Input tuples are fed into the network one by one.
- Activation values for the hidden nodes are computed.
- The output vector can be computed after the activation values of all hidden nodes are available.
- Weights are adjusted using the error (desired output - actual output), which is propagated backwards.

Network Pruning
- A fully connected network will be hard to articulate.
- n input nodes, h hidden nodes and m output nodes lead to $h(m + n)$ links (weights).
- Pruning: remove some of the links without affecting the classification accuracy of the network.

Backpropagation Network - Architecture
1) A network with 1 or more hidden layers:
  - output neurons: 1 output neuron for each class
  - hidden neurons (1 hidden layer in the example)
  - inputs: 1 input neuron for each attribute
2) Feedforward network: each neuron receives input only from the neurons in the previous layer.
3) Typically fully connected: all neurons in a layer are connected with all neurons in the next layer.
4) Weights initialization: small random values, e.g. in $[-1, 1]$.
5) Neuron model: weighted sum of input signals plus a differentiable transfer function, $a = f(wp + b)$. Any differentiable transfer function f can be used; most frequently the sigmoid and tan-sigmoid (hyperbolic tangent sigmoid) functions are used. With $net = wp + b$:
  - sigmoid: $a = \frac{1}{1 + e^{-net}}$
  - tan-sigmoid: $a = \frac{e^{net} - e^{-net}}{e^{net} + e^{-net}}$

Example data (weather / tennis data):

Outlook   Temperature  Humidity  Windy  Play
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rain      mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rain      mild         high      true   No
Architecture - Number of Input Units
- Numerical data: typically 1 input unit for each attribute.
- Categorical data: 1 input unit for each attribute value.
- How many input units for the weather data? One per attribute value: sunny, overcast, rainy (outlook); hot, mild, cool (temperature); high, normal (humidity); false, true (windy).
- Encoding of the input examples: typically binary, depending on the value of the attribute (on and off).
- Other possibilities are also acceptable. For example, Windy could be coded with only one unit: true or false (1 or 0).
[Figure: input units grouped by attribute, feeding the hidden layer(s) and the output layer.]

Number of Output Units
- Typically 1 neuron for each class.
- Encoding of the targets (classes): typically binary, e.g. class 1 (no): 1 0, class 2 (yes): 0 1.
- Another possibility is to code the target class with only one unit: Yes or No (1 or 0).
[Figure: the example (sunny, hot, high, false) is presented at the input units; the target class No is coded as 1 0 on the output units No and Yes.]

Number of Hidden Layers and Units in Them
- An art! Typically chosen by trial and error.
- The task constrains the number of input and output units, but not the number of hidden layers and the neurons in them.
  - Too many hidden layers and units (i.e. too many weights): overfitting.
  - Too few: underfitting, i.e. the NN is not able to learn the input-output mapping.
- A heuristic to start with: 1 hidden layer with n hidden neurons, n = (inputs + output_neurons) / 2.

Learning in Backpropagation NNs
Idea of backpropagation learning:
- For each training example p:
  - Propagate p through the network and calculate the output a.
  - Compare the desired output d with the actual output a and calculate the error.
  - Update the weights of the network to reduce the error.
- Repeat until the error over all examples is below a threshold.
Why backpropagation? It adjusts the weights backwards (from the output to the input units) by propagating the weight change:
$w_{pq}^{new} = w_{pq}^{old} + \Delta w_{pq}$
How to calculate the weight change $\Delta w_{pq}$?
[Figure: the labeled example (sunny, hot, high, false; No) is propagated forward, the output a is compared with the desired output d, and the error adjustments are propagated backward.]
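The input encoding described above (one binary unit per attribute value) can be sketched as follows. The dictionary layout and the function names are assumptions for illustration; the attribute values and the hidden-unit heuristic come from the slides.

```python
# One input unit per attribute value, in the order used on the architecture slide.
ATTRIBUTE_VALUES = {
    "outlook": ["sunny", "overcast", "rainy"],
    "temperature": ["hot", "mild", "cool"],
    "humidity": ["high", "normal"],
    "windy": ["false", "true"],
}


def encode(example):
    """Return the binary input vector: 1 for the observed value, 0 elsewhere."""
    bits = []
    for attribute, values in ATTRIBUTE_VALUES.items():
        bits.extend(1 if example[attribute] == value else 0 for value in values)
    return bits


def hidden_units_heuristic(n_inputs, n_outputs):
    """Starting point from the slides: (inputs + output_neurons) / 2."""
    return (n_inputs + n_outputs) // 2


if __name__ == "__main__":
    day = {"outlook": "sunny", "temperature": "hot",
           "humidity": "high", "windy": "false"}
    vector = encode(day)
    print(vector, len(vector))
    print(hidden_units_heuristic(len(vector), 2))
```

With this encoding the weather data needs 10 input units, and with 2 output units (No, Yes) the heuristic suggests starting with 6 hidden neurons.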
Backpropagation Learning
- The sum of squared errors is a classical measure of error. For a single training example, E is computed over all output neurons, where $d_i$ is the desired and $a_i$ the actual network output for output neuron i:
  $E = \frac{1}{2} \sum_i e_i^2 = \frac{1}{2} \sum_i (d_i - a_i)^2$
- Thus, backpropagation learning can be viewed as an optimization search in the weight space.
- Goal state: the set of weights for which the performance index (error) is minimum.
- Search method: hill climbing (reduce the error for each training example).

Steepest Gradient Descent
- The gradient $\partial E / \partial w$ gives the slope of the error function with respect to one weight; it points in the direction of steepest increase.
- A function decreases most rapidly when the direction of movement is the direction of the negative of the gradient.
- Hence, we want to adjust the weights so that the change moves the system down the error surface in the direction of the locally steepest descent, given by the negative of the gradient:
  $\Delta w_{pq} = -\eta \, \frac{\partial E}{\partial w_{pq}}$
- $\eta$ is the learning rate; it defines the step size, typically in the range $(0, 1)$.
- We want to find the weight where the slope (gradient) is 0.

Backpropagation Algorithm - Idea
- The backpropagation algorithm adjusts weights by working backward from the output layer to the input layer.
- Calculate the error and propagate this error from layer to layer.
- 2 approaches:
  - Incremental: the weights are adjusted after each training example is applied. Also called approximate steepest descent. Preferred as it requires less space.
  - Batch: weights are adjusted once, after all training examples are applied and a total error has been calculated.

Backpropagation Rule - Delta
- $w_{pq}(t)$ is the weight from node p to node q at time t:
  $w_{pq}(t + 1) = w_{pq}(t) + \Delta w_{pq}$
  $\Delta w_{pq} = \eta \, \delta_q \, o_p$ (the weight change)
- The weight change is proportional to the output activation of neuron p (i.e. $o_p$) and the error $\delta$ of neuron q (i.e. $\delta_q$).
- $\delta$ is calculated in 2 different ways:
  - q is an output neuron: $\delta_q = (d_q - o_q) \, f'(net_q)$
  - q is a hidden neuron: $\delta_q = f'(net_q) \sum_i w_{qi} \, \delta_i$ (i ranges over the nodes in the layer above q)
- $f'(net_q)$ is the derivative of the activation function at neuron q with respect to the input of q ($net_q$).
[Figure: solid lines show the forward propagation of signals; dashed lines show the backward propagation of error.]
Derivative of Sigmoid Activation Function
From the formulas for $\delta$, we must be able to calculate the derivative of f. For a sigmoid transfer function:
$f(net_m) = o_m = \frac{1}{1 + e^{-net_m}}$
$f'(net_m) = \frac{e^{-net_m}}{(1 + e^{-net_m})^2} = o_m (1 - o_m)$
Thus, the backpropagation errors for a network with a sigmoid transfer function are:
- q is an output neuron: $\delta_q = (d_q - o_q) \, o_q (1 - o_q)$
- q is a hidden neuron: $\delta_q = o_q (1 - o_q) \sum_i w_{qi} \, \delta_i$
with the weight change $\Delta w_{pq} = \eta \, \delta_q \, o_p$ as before.

Backpropagation Algorithm - Summary
1. Determine the architecture of the network:
  - how many input and output neurons; what output encoding
  - hidden neurons and layers
2. Initialize all weights (biases included) to small random values, typically in $[-1, 1]$.
3. Repeat until the termination criterion is satisfied:
  - (forward pass) Present a training example and propagate it through the network to calculate the actual output.
  - (backward pass) Compute the error (the $\delta$ values for the output neurons). Starting with the output layer, repeat for each layer in the network:
    - propagate the $\delta$ values back to the previous layer
    - update the weights between the two layers
An epoch is 1 pass through the training set. The stopping criteria are checked at the end of each epoch:
- The error (mean absolute or mean square) is below a threshold.
  - All training examples are propagated and the total error is calculated.
  - The threshold is determined heuristically, e.g. 0.3.
- The maximum number of epochs is reached.
- Early stopping using a validation set.
It typically takes hundreds or thousands of epochs for a NN to converge.

Some Interesting NN Applications
- There are many examples of applications using NNs.
- Network design is typically the result of several months of trial and error experimentation.
- Moral: NNs are widely applicable, but they cannot magically solve problems; wrong choices lead to poor performance.
- "NNs are the second best way of doing just about anything" (John Denker).
- NNs provide passable performance on many tasks that would be difficult to solve explicitly with other techniques.
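The incremental backpropagation update with sigmoid units, as summarised in the algorithm above, can be sketched for a small network with one hidden layer. This is only an illustration under stated assumptions: the weight layout (each row stores the bias first), the function names, the 2-2-1 architecture, and the starting weights are all invented for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Each weight row is [bias, w_1, w_2, ...]; returns (hidden outputs, network outputs)."""
    hidden = [sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
              for row in w_hidden]
    out = [sigmoid(row[0] + sum(w * h for w, h in zip(row[1:], hidden)))
           for row in w_out]
    return hidden, out


def squared_error(d, out):
    """E = 1/2 * sum_i (d_i - a_i)^2 for one training example."""
    return 0.5 * sum((dq - oq) ** 2 for dq, oq in zip(d, out))


def backprop_step(x, d, w_hidden, w_out, eta=0.5):
    """One incremental update, delta_w_pq = eta * delta_q * o_p, applied in place."""
    hidden, out = forward(x, w_hidden, w_out)
    # Output neurons: delta_q = (d_q - o_q) * o_q * (1 - o_q)
    delta_out = [(dq - oq) * oq * (1 - oq) for dq, oq in zip(d, out)]
    # Hidden neurons: delta_q = o_q * (1 - o_q) * sum_i w_qi * delta_i,
    # computed with the old output weights, before they are updated.
    delta_hidden = [
        h * (1 - h) * sum(w_out[i][q + 1] * delta_out[i] for i in range(len(w_out)))
        for q, h in enumerate(hidden)
    ]
    for i, row in enumerate(w_out):
        row[0] += eta * delta_out[i]              # the bias sees a constant input of 1
        for p, h in enumerate(hidden):
            row[p + 1] += eta * delta_out[i] * h
    for q, row in enumerate(w_hidden):
        row[0] += eta * delta_hidden[q]
        for p, xi in enumerate(x):
            row[p + 1] += eta * delta_hidden[q] * xi


if __name__ == "__main__":
    w_hidden = [[0.1, 0.2, -0.3], [-0.1, 0.4, 0.2]]  # illustrative starting weights
    w_out = [[0.05, 0.3, -0.2]]
    x, d = [1.0, 0.0], [1.0]
    for epoch in range(5):
        print(epoch, squared_error(d, forward(x, w_hidden, w_out)[1]))
        backprop_step(x, d, w_hidden, w_out)
```

Running the loop prints the error for the single example before each update; with a small learning rate it should shrink from step to step. A full trainer would loop over all training examples per epoch and apply one of the stopping criteria listed above.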
What is Bayesian Learning (Classification)?
- Bayesian classifiers are statistical classifiers.
- They can predict the class membership probability, i.e. the probability that a given example belongs to a particular class.
- They are based on the Bayes Theorem, presented by Thomas Bayes [1702-1761] in the Essay Towards Solving a Problem in the Doctrine of Chances, published posthumously by his friend Richard Price in the Philosophical Transactions of the Royal Society of London in 1763.

More on Bayesian Classifiers
- It uses probabilistic learning by calculating explicit probabilities for hypotheses.
- A naïve Bayesian classifier, which assumes total independence between attributes, is commonly used for data classification and learning problems. It performs well with large data sets and exhibits high accuracy.
- The model is incremental in the sense that each training example can incrementally increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.

Bayes Theorem
Given a data sample E (also called evidence) with an unknown class label, H is the hypothesis that E belongs to a specific class C. The probability of the hypothesis H conditioned on E, $P(H \mid E)$, also called the posterior probability, follows the Bayes theorem:
$P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
Example: instances of fruits, described by their colour and shape. Let E be "red and round" and H the hypothesis that E is an apple.

Bayes Theorem - The Fruit Example
- $P(H \mid E)$ reflects our confidence that E is an apple given that we have seen that E is red and round. It is called the posterior (a posteriori) probability of H conditioned on E.
- $P(H)$ is the probability that any given example is an apple, regardless of how it looks. It is called the prior (a priori) probability of H.
- The posterior probability is based on more information than the prior probability, which is independent of E.
- What is $P(E \mid H)$? The probability of E conditioned on H: the probability that E is red and round given that we know that E is an apple.
- What is $P(E)$? The prior probability of E: the probability that an example from the fruit data set is red and round.
Bayes Theorem - How to use it for classification?
$P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$
In classification tasks we would like to predict the class of a new example E. We can do this by:
- calculating $P(H \mid E)$ for each H (class), i.e. the probability that the hypothesis H is true given the example E
- comparing these probabilities and assigning E to the class with the highest probability.
How to estimate $P(E)$, $P(H)$ and $P(E \mid H)$? From the given data (this is the training phase of the classifier).

Naïve Bayes Classifier
- Suppose we have n classes $C_1, C_2, \ldots, C_n$. Given an unknown sample $X = (x_1, x_2, \ldots, x_m)$, the classifier will predict that X belongs to the class with the highest posterior probability:
  $X \in C_i$ if $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le n$, $j \ne i$
- Maximizing $\frac{P(X \mid C_i) \, P(C_i)}{P(X)}$ is the same as maximizing $P(X \mid C_i) \, P(C_i)$, because $P(X)$ is the same for every class.
- $P(C_i) = s_i / s$, where s is the number of training samples and $s_i$ the number of them in class $C_i$.
- $P(X \mid C_i) = \prod_{k=1}^{m} P(x_k \mid C_i)$, where $P(x_k \mid C_i) = s_{ik} / s_i$ and $s_{ik}$ is the number of samples of class $C_i$ with value $x_k$ for attribute k.
- This greatly reduces the computation cost: only the class distribution has to be counted.
- Naïve: class conditional independence.

Naïve Bayes Algorithm - Basic Assumption
- Naïve Bayes uses all attributes to make a decision and allows them to make contributions to the decision that are equally important and independent of one another.
  - Independence assumption: attributes are conditionally independent of each other given the class.
  - Equal importance assumption: attributes are equally important.
- Unrealistic assumptions! This is why it is called Naïve Bayes:
  - attributes are in fact dependent on one another
  - attributes are not equally important
- But these assumptions lead to a simple method which works surprisingly well in practice!

Naïve Bayes (NB) for the Tennis Example
- Consider the tennis (weather) data: the 14 days listed earlier with the attributes Outlook, Temperature, Humidity, Windy and the class Play.
- Suppose we encounter a new example which has to be classified:

  Outlook  Temperature  Humidity  Windy  Play
  sunny    cool         high      true   ?

- Recall the Bayes theorem: $P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}$. What are H and E for our example?
  - The hypothesis H is that the class is Play = Yes (and there is another hypothesis: that the class is Play = No).
  - The evidence E is the new example (i.e. a particular combination of observed attribute values for the new day).
Naïve Bayes for the Tennis Example - 2
- We need to calculate $P(\text{yes} \mid E)$ and $P(\text{no} \mid E)$, where E is the new example (sunny, cool, high, true), and compare them.
- Denote the 4 pieces of evidence:
  - outlook = sunny with $E_1$
  - temperature = cool with $E_2$
  - humidity = high with $E_3$
  - windy = true with $E_4$
- If we assume that they are independent given the class, then their combined probability is obtained by multiplication:
  $P(E \mid \text{yes}) = P(E_1 \mid \text{yes}) \, P(E_2 \mid \text{yes}) \, P(E_3 \mid \text{yes}) \, P(E_4 \mid \text{yes})$

Naïve Bayes for the Tennis Example - 3
- Hence
  $P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes}) \, P(E_2 \mid \text{yes}) \, P(E_3 \mid \text{yes}) \, P(E_4 \mid \text{yes}) \, P(\text{yes})}{P(E)}$
  $P(\text{no} \mid E) = \frac{P(E_1 \mid \text{no}) \, P(E_2 \mid \text{no}) \, P(E_3 \mid \text{no}) \, P(E_4 \mid \text{no}) \, P(\text{no})}{P(E)}$
- The probabilities in the numerator are estimated from the data.
- There is no need to estimate $P(E)$: it appears in the denominator for every hypothesis, so it disappears when we compare them.

Naïve Bayes for the Tennis Example, cont. 1
Tennis data: counts and probabilities.

Attribute    Value     Count (yes)  Count (no)  P(value|yes)  P(value|no)
outlook      sunny     2            3           2/9           3/5
outlook      overcast  4            0           4/9           0/5
outlook      rainy     3            2           3/9           2/5
temperature  hot       2            2           2/9           2/5
temperature  mild      4            2           4/9           2/5
temperature  cool      3            1           3/9           1/5
humidity     high      3            4           3/9           4/5
humidity     normal    6            1           6/9           1/5
windy        false     6            2           6/9           2/5
windy        true      3            3           3/9           3/5
play                   9            5           9/14          5/14

- 9/14 is the proportion of days when play is yes.
- 6/9 is the proportion of days when humidity is normal among the days when play is yes, i.e. the probability of humidity being normal given that play is yes.

Naïve Bayes for the Tennis Example, cont. 2
$P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes}) \, P(E_2 \mid \text{yes}) \, P(E_3 \mid \text{yes}) \, P(E_4 \mid \text{yes}) \, P(\text{yes})}{P(E)} = ?$
- $P(E_1 \mid \text{yes}) = P(\text{outlook} = \text{sunny} \mid \text{yes}) = 2/9$
- $P(E_2 \mid \text{yes}) = P(\text{temperature} = \text{cool} \mid \text{yes}) = 3/9$
- $P(E_3 \mid \text{yes}) = P(\text{humidity} = \text{high} \mid \text{yes}) = 3/9$
- $P(E_4 \mid \text{yes}) = P(\text{windy} = \text{true} \mid \text{yes}) = 3/9$
- $P(\text{yes}) = ?$ This is the probability of Play = yes without knowing any E, i.e. anything about the particular day; it is the prior probability of yes: $P(\text{Play} = \text{yes}) = 9/14$.
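The multiplication set up above can be checked with exact fractions. The sketch below hard-codes only the probabilities needed for the new day (sunny, cool, high, true), taken from the counts table; the dictionary layout and function names are assumptions for illustration.

```python
from fractions import Fraction as F

# Prior and conditional probabilities read off the tennis counts table.
PRIOR = {"yes": F(9, 14), "no": F(5, 14)}
LIKELIHOOD = {
    "yes": {"outlook=sunny": F(2, 9), "temperature=cool": F(3, 9),
            "humidity=high": F(3, 9), "windy=true": F(3, 9)},
    "no": {"outlook=sunny": F(3, 5), "temperature=cool": F(1, 5),
           "humidity=high": F(4, 5), "windy=true": F(3, 5)},
}


def score(cls, evidence):
    """Numerator of Bayes theorem: P(E1|cls) * ... * P(E4|cls) * P(cls)."""
    result = PRIOR[cls]
    for item in evidence:
        result *= LIKELIHOOD[cls][item]
    return result


def posteriors(evidence):
    """Normalise the scores so they sum to 1 (this divides out P(E))."""
    scores = {cls: score(cls, evidence) for cls in PRIOR}
    total = sum(scores.values())
    return {cls: value / total for cls, value in scores.items()}


if __name__ == "__main__":
    new_day = ["outlook=sunny", "temperature=cool", "humidity=high", "windy=true"]
    for cls in PRIOR:
        print(cls, float(score(cls, new_day)))
    print({cls: float(p) for cls, p in posteriors(new_day).items()})
```

The unnormalised scores come out at roughly 0.0053 for yes and 0.0206 for no, so play = no is about four times as likely for this day; after normalisation that is roughly 20% against 80%.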
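Naïve Bayes for the Tennis Example, cont. 3
- By substituting the respective evidence probabilities:
  $P(\text{yes} \mid E) = \frac{\frac{2}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{9}{14}}{P(E)} = \frac{0.0053}{P(E)}$
- Similarly calculating:
  $P(\text{no} \mid E) = \frac{\frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14}}{P(E)} = \frac{0.0206}{P(E)}$
- So $P(\text{no} \mid E) > P(\text{yes} \mid E)$: for the new day, play = no is more likely than play = yes (about 4 times more likely).

A Problem with Naïve Bayes
- Suppose that the training data for the tennis example was different: outlook = sunny had always been associated with play = no (i.e. outlook = sunny had never occurred together with play = yes).
- Then $P(\text{outlook} = \text{sunny} \mid \text{yes}) = 0$, so
  $P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes}) \, P(E_2 \mid \text{yes}) \, P(E_3 \mid \text{yes}) \, P(E_4 \mid \text{yes}) \, P(\text{yes})}{P(E)} = 0$
- The final probability $P(\text{yes} \mid E)$ is 0 no matter what the other probabilities are, i.e. zero probabilities hold a veto over the other probabilities.
- This is a problem! If it happens in the training set, prediction on new data will be poor.
- Solution: use the Laplace estimator (correction) to calculate probabilities. It adds 1 to the numerator and k to the denominator, where k is the number of attribute values for a given attribute.

Laplace Correction - Modified Tennis Example

outlook   Count (yes)  Count (no)  P(value|yes)  P(value|no)
sunny     0            5           0/7           5/7
overcast  4            0           4/7           0/7
rainy     3            2           3/7           2/7

- Without correction: $P(\text{sunny} \mid \text{yes}) = 0/7$, $P(\text{overcast} \mid \text{yes}) = 4/7$, $P(\text{rainy} \mid \text{yes}) = 3/7$.
- The Laplace correction adds 1 to the numerator and 3 to the denominator (outlook has 3 values):
  $P(\text{sunny} \mid \text{yes}) = \frac{0 + 1}{7 + 3} = \frac{1}{10}$, $P(\text{overcast} \mid \text{yes}) = \frac{4 + 1}{7 + 3} = \frac{5}{10}$, $P(\text{rainy} \mid \text{yes}) = \frac{3 + 1}{7 + 3} = \frac{4}{10}$
- This ensures that an attribute value which occurs 0 times receives a nonzero (although small) probability.

Laplace Correction - Original Tennis Example

outlook   Count (yes)  Count (no)  P(value|yes)  P(value|no)
sunny     2            3           2/9           3/5
overcast  4            0           4/9           0/5
rainy     3            2           3/9           2/5

- Without correction: $P(\text{sunny} \mid \text{yes}) = 2/9$, $P(\text{overcast} \mid \text{yes}) = 4/9$, $P(\text{rainy} \mid \text{yes}) = 3/9$.
- With the Laplace correction:
  $P(\text{sunny} \mid \text{yes}) = \frac{2 + 1}{9 + 3} = \frac{3}{12}$, $P(\text{overcast} \mid \text{yes}) = \frac{4 + 1}{9 + 3} = \frac{5}{12}$, $P(\text{rainy} \mid \text{yes}) = \frac{3 + 1}{9 + 3} = \frac{4}{12}$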
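The Laplace correction is a one-line formula, sketched below for the outlook attribute of both versions of the tennis data. The function name and the dictionary layout are assumptions for illustration.

```python
from fractions import Fraction


def laplace_estimate(count, class_total, num_values):
    """(count + 1) / (class_total + k), where k is the number of attribute values."""
    return Fraction(count + 1, class_total + num_values)


def corrected_distribution(counts):
    """Apply the correction to every value of one attribute within one class."""
    total = sum(counts.values())
    k = len(counts)
    return {value: laplace_estimate(c, total, k) for value, c in counts.items()}


if __name__ == "__main__":
    modified_yes = {"sunny": 0, "overcast": 4, "rainy": 3}  # outlook counts for play = yes
    original_yes = {"sunny": 2, "overcast": 4, "rainy": 3}
    print(corrected_distribution(modified_yes))
    print(corrected_distribution(original_yes))
```

Note that `Fraction` reduces its results, so 5/10 prints as 1/2 and 4/12 as 1/3. The corrected values still sum to 1 for each class, and the value that was never observed keeps a small nonzero probability.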
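Handling Missing Values
- Easy: for a missing value in the evidence E (the new example), omit this attribute.
  E.g. E: outlook = ?, temperature = cool, humidity = high, windy = true. Then
  $P(\text{yes} \mid E) = \frac{\frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{9}{14}}{P(E)} = \frac{0.0238}{P(E)}$
  $P(\text{no} \mid E) = \frac{\frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14}}{P(E)} = \frac{0.0343}{P(E)}$
- Compare these results with the previous ones! As one of the fractions is missing, the probabilities are higher than before, but this is not a problem as there is a missing fraction in both cases.
- Missing value in a training example: do not include it in the frequency counts, and calculate the probabilities based on the number of values that actually occur and not on the total number of training examples.

Handling Numeric Attributes
- Suppose temperature and humidity are numeric attributes. We would like to classify the following new example: outlook = sunny, temperature = 66, humidity = 90, windy = true.
- Question: how to calculate $P(\text{temperature} = 66 \mid \text{yes})$, $P(\text{humidity} = 90 \mid \text{yes})$, $P(\text{temperature} = 66 \mid \text{no})$, $P(\text{humidity} = 90 \mid \text{no})$?

Using Probability Density Function
- Answer: by assuming that numerical values have a normal (Gaussian) probability distribution and using the probability density function.
- For a normal distribution with mean $\mu$ and standard deviation $\sigma$, the probability density function is:
  $f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
- What is the meaning of the probability density function of a continuous random variable?
  - It is closely related to probability but is not exactly the probability (e.g. the probability that x is exactly 66 is 0).
  - The probability that x takes a value in a small region (between $x - \varepsilon/2$ and $x + \varepsilon/2$) is approximately $\varepsilon \, f(x)$ (e.g. the probability that x is between 64 and 68 is approximately $4 f(66)$).

Calculating Probabilities Using Probability Density Function
- With mean 73 and standard deviation 6.2 for temperature on the play = yes days:
  $f(\text{temperature} = 66 \mid \text{yes}) = \frac{1}{6.2 \sqrt{2\pi}} \, e^{-\frac{(66 - 73)^2}{2 \cdot 6.2^2}} = 0.034$
- $f(\text{humidity} = 90 \mid \text{yes})$ and the densities for play = no are computed the same way from the corresponding means and standard deviations.
- These densities replace the categorical fractions in the products for $P(\text{yes} \mid E)$ and $P(\text{no} \mid E)$. Compare with the categorical tennis data: again $P(\text{no} \mid E) > P(\text{yes} \mid E)$, so the prediction is no play.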
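The density value for temperature = 66 can be verified directly from the formula. The sketch below also compares the "ε f(x)" approximation for the interval from 64 to 68 with the exact normal probability obtained from the error function. The function names are assumptions for illustration; the mean 73 and standard deviation 6.2 are the values used on the slide.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))


def normal_cdf(x, mean, std):
    """Cumulative distribution function, expressed through math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2))))


if __name__ == "__main__":
    density = normal_pdf(66, mean=73, std=6.2)
    print(density)  # f(temperature = 66 | yes)

    # Probability of a small region around x: epsilon * f(x) versus the exact value.
    approx = 4 * density
    exact = normal_cdf(68, 73, 6.2) - normal_cdf(64, 73, 6.2)
    print(approx, exact)
```

The density is about 0.034, matching the slide, and for this interval the approximation and the exact probability should agree to within a fraction of a percentage point.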
Naïve Bayes Advantages & Disadvantages
Advantages:
- simple approach
- clear semantics for representing, using and learning probabilistic knowledge
- requires 1 scan of the training data
- in many cases outperforms more sophisticated learning methods: always try the simple method first!
Disadvantages:
- While there is only 1 scan, it is still computationally expensive.
- Since attributes are treated as though they were completely independent, the existence of dependencies between attributes skews the learning process!
- Normal distribution assumption when dealing with numeric attributes: a (minor) restriction; discretize the data or use other distributions.

Belief Network
- Allows class conditional dependencies to be expressed.
- It has a directed acyclic graph (DAG) and a set of conditional probability tables (CPT).
- Nodes in the graph represent variables, and arcs represent probabilistic dependencies (child dependent on parent).
- There is one table for each variable X. The table contains the conditional distribution $P(X \mid Parents(X))$.

Bayesian Belief Networks - Example
[Figure: a belief network over the variables FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay and Dyspnea, together with the conditional probability table for the variable LungCancer; the table has the columns (FH, S), (FH, ~S), (~FH, S), (~FH, ~S) and the rows LC and ~LC.]

Bayesian Belief Networks - Learning
Several cases of learning Bayesian belief networks:
- When both the network structure and all the variables are given, learning is simply computing the CPT.
- When the network structure is given but some variables are not known or observable, iterative learning is necessary (compute the gradient of $\ln P(s \mid H)$, take steps toward the gradient and normalize).
- Many algorithms for learning the network structure exist.
Lecture Outline
Part III: k-Nearest Neighbour
- Lazy Learning
- Nearest Neighbour
- K-Nearest Neighbours
- Agglomerative Nearest Neighbours
Part IV: Decision Trees
- What is a Decision Tree?
- Building a tree
- Pruning a tree
Part V: Associative Classifiers
- Rule Generation
- Rule Pruning
- Rule Selection
- Rule Combination

k-Nearest Neighbours (k-NN) Classification
- In k-nearest-neighbour classification, the training dataset is used to classify each member of a "target" dataset.
- No model is created during a learning phase; the training set itself plays that role.
- It is called a lazy-learning method.
- Rather than building a model and referring to it during classification, k-NN directly refers to the training set for classification.

The Simple Nearest Neighbour Approach
- Nearest Neighbour is very simple.
- The training is nothing more than sorting the training data and storing it in a list.
- To classify a new entry, this entry is compared to the list to find the closest record, with values as similar as possible to the entry to classify (i.e. the nearest neighbour).
- The class of this record is simply assigned to the new entry.
- Different measures of similarity or distance can be used.
The Nearest Neighbour
[Figure: the new entry is compared with the sorted training data; a distance function finds the record with the closest values, and that record's class label becomes the class label of the new entry.]

The k-Nearest Neighbour Approach
- The k-Nearest Neighbour is a variation of Nearest Neighbour. Instead of looking for only the closest record to the entry to classify, we look for the k records closest to it.
- To assign a class label to the new entry, from all the labels of the k nearest records we take the majority class label.
- Nearest Neighbour is a case of k-Nearest Neighbours with k = 1.

k Nearest Neighbours
[Figure: the new entry is compared with the sorted training data; a distance function finds the k records with the closest values, and a vote among them gives the class label of the new entry.]

Agglomerative Nearest Neighbours
- Training records are put together in groups as the learning process goes on.
- The approach is named agglomerative because groups or clusters are merged during the learning.
- The training is relatively simple:
- Each cluster has a center c and a radius r and the class label of its records.
- Initially each record in the training set forms a cluster on its own.
- Two clusters that are close together (within some epsilon distance of each other) and classify in the same category are combined to build a new aggregate cluster: a hypersphere of a larger radius and a new center.
- If a cluster is not close to any other cluster given epsilon, it remains separate.
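The majority vote of the k-nearest-neighbour approach described above can be sketched as follows. The training records and the tie-breaking behaviour (a tie goes to the class seen first among the nearest records) are assumptions of this sketch, not something the slides specify.

```python
import math
from collections import Counter

def knn_classify(training, entry, k=3):
    """Label `entry` with the majority class among its k closest training records."""
    neighbours = sorted(training, key=lambda record: math.dist(record[0], entry))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (attribute values, class label).
TRAINING = [
    ((1.0, 1.0), "A"),
    ((1.0, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((2.5, 2.5), "B"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.0), "B"),
]

print(knn_classify(TRAINING, (2.0, 2.0), k=1))  # plain nearest neighbour
print(knn_classify(TRAINING, (2.0, 2.0), k=3))  # vote among three neighbours
```

On this toy data the single closest record and the three-record vote disagree, which is exactly the situation where choosing k matters.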
Agglomerative NN Classification
- The classification of a new entry consists of finding the closest cluster to it and assigning it the label attached to that cluster.
- Agglomerative NN has a slower training than NN or k-NN but has the advantage of using less memory. There is no need to store the whole training set, only the center and radius of each cluster.
- In the two extremes: if all records are far from each other they remain separate clusters, which is Nearest Neighbour. If all points are close to each other we end up with as many clusters as we have classes.

Cluster Overlap Problem
- Since clusters are hyperspheres, overlap of clusters of different labels is bound to happen.
- Clusters that grow and overlap with nearby clusters that classify differently can reduce accuracy.
- A new entry that falls in an overlap area between two clusters can easily be misclassified.
- One solution is to inhibit clusters from growing if enlarging a hypersphere would generate overlap with a cluster of a different class label, and simply create a new small cluster in between instead.

Agglomerative Nearest Neighbours
[Figure: the new entry is compared with the grouped training data; a distance function finds the closest cluster, whose class label becomes the class label of the new entry.]

Distance Measures
The most used distance function is the Euclidean distance:
$$d(X, Y) = \sqrt{\sum_i (x_i - y_i)^2}$$
However, other measures are possible:
- the Manhattan distance:
$$d(X, Y) = \sum_i |x_i - y_i|$$
- the Chebychev distance:
$$d(X, Y) = \max_i |x_i - y_i|$$
- the cosine measure:
$$d(X, Y) = 1 - \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}}$$
- Pearson's correlation:
$$d(X, Y) = 1 - \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$
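The distance measures listed above translate directly into code. The sketch below assumes equal-length numeric vectors and writes the cosine and Pearson measures as one minus the similarity, so that all five functions behave as distances; that convention is an assumption of this example.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebychev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms  # undefined for an all-zero vector

def pearson_distance(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    # Pearson's correlation is the cosine of the mean-centred vectors.
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])

X, Y = (1, 2, 3), (4, 6, 3)
for measure in (euclidean, manhattan, chebychev, cosine_distance, pearson_distance):
    print(measure.__name__, measure(X, Y))
```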
Similarity of Categorical Data
Similarity measure is the inverse of distance measure. For categorical or binary data the distance is:
- Simple match:
$$d(X, Y) = \sum_i \delta(x_i, y_i), \quad \text{where } \delta(x_i, y_i) = 0 \text{ if } x_i = y_i;\ \delta(x_i, y_i) = 1 \text{ otherwise}$$
- Jaccard Coefficient:
$$Sim(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}$$

Attribute Weights
Not all attributes have the same importance in measuring similarity (or distance). Attributes could be weighted.
Non-weighted Euclidean distance:
$$d(X, Y) = \sqrt{\sum_i (x_i - y_i)^2}$$
versus weighted Euclidean distance:
$$d(X, Y) = \sqrt{\sum_i w_i (x_i - y_i)^2}$$

What is a Decision Tree?
A decision tree is a flow-chart-like tree structure.
- An internal node denotes a test on an attribute.
- A branch represents an outcome of the test. All tuples in a branch have the same value for the tested attribute.
- A leaf node represents a class label or class label distribution.
[Figure: a tree whose internal nodes are attribute tests (Atr?) and whose leaves are class labels (CL).]

An Example from Quinlan's ID3
Training set:
Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N
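The categorical and weighted measures above can be sketched as small functions. The function names and the sample inputs are assumptions chosen for illustration; the Jaccard example treats each record as a set of items.

```python
import math

def simple_match_distance(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(0 if a == b else 1 for a, b in zip(x, y))

def jaccard_similarity(x, y):
    """|X intersect Y| / |X union Y| for two collections of items."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 1.0

def weighted_euclidean(x, y, weights):
    """Euclidean distance with one weight per attribute."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

print(simple_match_distance(("sunny", "hot", "high", "false"),
                            ("sunny", "mild", "high", "true")))
print(jaccard_similarity({"X", "Y", "Z"}, {"X", "Z"}))
print(weighted_euclidean((0, 0), (3, 4), (1, 1)))  # equal weights: plain Euclidean
print(weighted_euclidean((0, 0), (3, 4), (1, 0)))  # second attribute ignored
```

Setting a weight to zero removes an attribute from the comparison entirely, which is one simple way to see the effect of attribute weighting.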
A Sample Decision Tree
Decision tree for the ID3 training set (Outlook, Temperature, Humidity, Windy, Class):
Outlook?
  sunny    -> Humidity?  (high -> N, normal -> P)
  overcast -> P
  rain     -> Windy?     (true -> N, false -> P)

Decision-Tree Classification Methods
The basic top-down decision tree generation approach usually consists of two phases:
1. Tree construction
- At the start, all the training examples are at the root.
- Examples are partitioned recursively based on selected attributes.
2. Tree pruning
- Aims at removing tree branches that may reflect noise in the training data and lead to errors when classifying test data, which improves classification accuracy.

Decision Tree Construction
Recursive process:
- The tree starts as a single node representing all data.
- If the samples are all of the same class, the node becomes a leaf labeled with the class label.
- Otherwise, select the attribute that best separates the samples into individual classes.
- Recursion stops when:
- the samples in a node belong to the same class (majority);
- there are no remaining attributes on which to split;
- there are no samples with the attribute value.

Choosing the Attribute to Split Data Set
The measure is also called a goodness function. Different algorithms may use different goodness functions:
- information gain (ID3/C4.5)
- assumes all attributes to be categorical.
- can be modified for continuous-valued attributes.
- gini index
- assumes all attributes are continuous-valued.
- assumes there exist several possible split values for each attribute.
- may need other tools, such as clustering, to get the possible split values.
- can be modified for categorical attributes.
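To make the sample tree above concrete, it can be written down as a nested structure and walked from the root to a leaf. The representation (a tuple of attribute name and branch dictionary, with plain strings as leaves) is an assumption of this sketch; the three test samples are rows of the ID3 training set.

```python
# Internal node: (attribute, {value: subtree}); leaf: class label string.
SAMPLE_TREE = ("outlook", {
    "sunny": ("humidity", {"high": "N", "normal": "P"}),
    "overcast": "P",
    "rain": ("windy", {"true": "N", "false": "P"}),
})

def classify(tree, sample):
    """Follow the branch matching the sample's attribute value until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[sample[attribute]]
    return tree

SAMPLES = [
    ({"outlook": "sunny", "temperature": "hot", "humidity": "high", "windy": "false"}, "N"),
    ({"outlook": "overcast", "temperature": "cool", "humidity": "normal", "windy": "true"}, "P"),
    ({"outlook": "rain", "temperature": "mild", "humidity": "high", "windy": "true"}, "N"),
]

for sample, expected in SAMPLES:
    print(classify(SAMPLE_TREE, sample), "expected", expected)
```

Note that Temperature is never tested: the tree separates the training classes without it.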
Information Gain (ID3/C4.5)
Assume that there are two classes, P and N. Let the set of examples S contain x elements of class P and y elements of class N. The amount of information needed to decide if an arbitrary example in S belongs to P or N is defined as:
$$I(S_P, S_N) = -\frac{x}{x+y}\log_2\frac{x}{x+y} - \frac{y}{x+y}\log_2\frac{y}{x+y}$$
In general:
$$I(s_1, s_2, \ldots, s_n) = -\sum_{i=1}^{n} p_i \log_2(p_i)$$
where p_i is estimated by s_i/s.
Assume that using attribute A as the root in the tree will partition S into sets {S_1, S_2, ..., S_v}. If S_i contains x_i examples of P and y_i examples of N, the information needed to classify objects in all subtrees S_i is:
$$E(A) = \sum_{i=1}^{v} \frac{x_i + y_i}{x + y}\, I(S_{P_i}, S_{N_i})$$
In general:
$$E(A) = \sum_{i=1}^{v} \frac{s_{1i} + \cdots + s_{ni}}{s}\, I(s_{1i}, s_{2i}, \ldots, s_{ni})$$

Information Gain: Example
The attribute A is selected such that the information gain
$$gain(A) = I(S_P, S_N) - E(A)$$
is maximal, that is, E(A) is minimal since I(S_P, S_N) is the same for all attributes at a node.
In the given sample data, attribute outlook is chosen to split at the root:
- gain(outlook) = 0.246
- gain(temperature) = 0.029
- gain(humidity) = 0.151
- gain(windy) = 0.048
The information gain measure tends to favor attributes with many values. Other possibilities: Gini Index, χ², etc.

Gini Index
If a data set S contains examples from n classes, the gini index gini(S) is defined as
$$gini(S) = 1 - \sum_{j=1}^{n} p_j^2$$
where p_j is the relative frequency of class j in S.
If a data set S is split into two subsets S_1 and S_2 with sizes N_1 and N_2 respectively (N = N_1 + N_2), the gini index of the split data is defined as
$$gini_{split}(S) = \frac{N_1}{N}\, gini(S_1) + \frac{N_2}{N}\, gini(S_2)$$
The attribute that provides the smallest gini_split(S) is chosen to split the node (this requires enumerating all possible splitting points for each attribute).

Example for gini Index
Suppose there are two attributes, age and income, and the class label is buy or not buy.
- There are three possible split values for age: 30, 40, 50.
- There are two possible split values for income: 30K, 40K.
- We need to calculate the following gini indexes: gini_{age=30}(S), gini_{age=40}(S), gini_{age=50}(S), gini_{income=30k}(S), gini_{income=40k}(S).
- Choose the minimal one as the split attribute.
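The information gain and gini definitions above can be checked against the 14-tuple ID3 training set from the earlier example. This is a minimal sketch: the helper names `info`, `gini` and `gain` are chosen here for illustration, and the data is re-entered as compact strings.

```python
import math
from collections import Counter

ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")
# The 14 training tuples of the ID3 example; the last field is the class (P or N).
ROWS = [row.split() for row in (
    "sunny hot high false N", "sunny hot high true N",
    "overcast hot high false P", "rain mild high false P",
    "rain cool normal false P", "rain cool normal true N",
    "overcast cool normal true P", "sunny mild high false N",
    "sunny cool normal false P", "rain mild normal false P",
    "sunny mild normal true P", "overcast mild high true P",
    "overcast hot normal false P", "rain mild high true N",
)]

def info(labels):
    """I(s_1, ..., s_n) = -sum of p_i * log2(p_i) over the classes present."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def gini(labels):
    """gini(S) = 1 - sum of squared relative class frequencies."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gain(rows, column):
    """gain(A) = I(S) - E(A) for the attribute stored in `column`."""
    labels = [row[-1] for row in rows]
    expected = 0.0  # E(A): weighted information of the partitions induced by A
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        expected += len(subset) / len(rows) * info(subset)
    return info(labels) - expected

GAINS = {name: gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
for name, value in GAINS.items():
    print(f"gain({name}) = {value:.4f}")
print(f"gini(S) = {gini([row[-1] for row in ROWS]):.4f}")
```

Because the script does not round intermediate results, its gains can differ from the slide's figures in the third decimal place; the ranking, with outlook first, is the same.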
Primary Issues in Tree Construction
- Split criterion: used to select the attribute to be split at a tree node during the tree generation phase. Different algorithms may use different goodness functions: information gain, gini index, etc.
- Branching scheme: determining the tree branch to which a sample belongs. Binary splitting (gini index) versus many splitting (information gain).
- Stopping decision: when to stop the further splitting of a node, e.g. impurity measure.
- Labeling rule: a node is labeled as the class to which most samples at the node belong.

How to construct a tree?
Algorithm:
- Greedy algorithm: make the optimal choice at each step, that is, select the best attribute for each tree node.
- Top-down recursive divide-and-conquer manner:
- from root to leaf;
- split the node into several branches;
- for each branch, recursively run the algorithm.

Example for Algorithm (ID3)
All attributes are categorical.
- Create a node N.
- If the samples are all of the same class C, then return N as a leaf node labeled with C.
- If the attribute-list is empty, then return N as a leaf node labeled with the most common class.
- Select the split-attribute with the highest information gain and label N with the split-attribute.
- For each value A_i of the split-attribute, grow a branch from node N.
- Let S_i be the branch in which all tuples have the value A_i for the split-attribute.
- If S_i is empty, then attach a leaf labeled with the most common class.
- Else recursively run the algorithm at node S_i.
- Repeat until all branches reach leaf nodes.

How to use a tree?
- Directly: test the attribute values of the unknown sample against the tree. A path is traced from the root to a leaf, which holds the label.
- Indirectly: the decision tree is converted to classification rules. One rule is created for each path from the root to a leaf. IF-THEN rules are easier for humans to understand.
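The ID3 steps listed above can be sketched as a short recursive function. This is an illustration under assumptions, not the lecture's implementation: samples are dictionaries, a tree is a tuple of attribute name and branch dictionary with class labels as leaves, and ties between equally good attributes go to the first one listed. The data is the 14-tuple training set of the earlier example.

```python
import math
from collections import Counter

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
DATA = [dict(zip(ATTRIBUTES + ["class"], row.split())) for row in (
    "sunny hot high false N", "sunny hot high true N",
    "overcast hot high false P", "rain mild high false P",
    "rain cool normal false P", "rain cool normal true N",
    "overcast cool normal true P", "sunny mild high false N",
    "sunny cool normal false P", "rain mild normal false P",
    "sunny mild normal true P", "overcast mild high true P",
    "overcast hot normal false P", "rain mild high true N",
)]

def info(samples):
    counts = Counter(s["class"] for s in samples)
    return -sum(c / len(samples) * math.log2(c / len(samples)) for c in counts.values())

def gain(samples, attribute):
    expected = 0.0
    for value in {s[attribute] for s in samples}:
        subset = [s for s in samples if s[attribute] == value]
        expected += len(subset) / len(samples) * info(subset)
    return info(samples) - expected

def id3(samples, attributes, domains):
    classes = Counter(s["class"] for s in samples)
    majority = classes.most_common(1)[0][0]
    # Leaf: all samples share one class, or no attribute is left to split on.
    if len(classes) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: gain(samples, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in domains[best]:
        subset = [s for s in samples if s[best] == value]
        # An empty branch becomes a leaf labeled with the most common class.
        branches[value] = id3(subset, remaining, domains) if subset else majority
    return (best, branches)

DOMAINS = {a: sorted({s[a] for s in DATA}) for a in ATTRIBUTES}
TREE = id3(DATA, ATTRIBUTES, DOMAINS)
print(TREE)
```

On this data the function rebuilds the sample tree shown earlier: outlook at the root, humidity under sunny and windy under rain.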
Avoid Over-fitting in Classification
A tree generated may over-fit the training examples due to noise or too small a set of training data.
Two approaches to avoid over-fitting:
- (Stop earlier): stop growing the tree earlier.
- (Post-prune): allow over-fit and then post-prune the tree.
Approaches to determine the correct final tree size:
- Separate training and testing sets or use cross-validation.
- Use all the data for training, but apply a statistical test (e.g., chi-square) to estimate whether expanding or pruning a node may improve over the entire distribution.
- Use the Minimum Description Length (MDL) principle: halt growth of the tree when the encoding is minimized.
Rule post-pruning (C4.5): converting to rules before pruning.

Continuous and Missing Values in Decision-Tree Induction
Dynamically define new discrete-valued attributes that partition the continuous attribute value into a discrete set of intervals.
[Table: six examples sorted by Temperature, with play tennis labels No, No, Yes, Yes, Yes, No.]
- Sort the examples according to the continuous attribute A, then identify adjacent examples that differ in their target classification, generate a set of candidate thresholds midway, and select the one with the maximum gain.
- Extensible to split continuous attributes into multiple intervals.
Assign missing attribute values either:
- by assigning the most common value of A(x), or
- by assigning a probability to each of the possible values of A.

Alternative Measures for Selecting Attributes
Info gain naturally favours attributes with many values.
One alternative measure: gain ratio (Quinlan 86), which penalizes attributes with many values.
$$SplitInfo(S, A) = -\sum_i \frac{|S_i|}{|S|}\log_2\frac{|S_i|}{|S|}$$
$$GainRatio(S, A) = \frac{Gain(S, A)}{SplitInfo(S, A)}$$
Problem: the denominator can be 0 or close to it, which makes GainRatio very large.
Distance-based measure (Lopez de Mantaras 91):
- define a distance metric between partitions of the data.
- choose the one closest to the perfect partition.
There are many other measures. Mingers 9 provides an experimental analysis of the effectiveness of several selection measures over a variety of problems.

Tree Pruning
A decision tree constructed using the training data may have too many branches/leaf nodes.
- Caused by noise, over-fitting.
- May result in poor accuracy for unseen samples.
Prune the tree: merge a subtree into a leaf node.
- Use a set of data different from the training data.
- At a tree node, if the accuracy without splitting is higher than the accuracy with splitting, replace the subtree with a leaf node and label it using the majority class.
Issues:
- Obtaining the testing data.
- Criteria other than accuracy (e.g. minimum description length).
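The threshold procedure for continuous attributes described above (sort, find adjacent examples whose classes differ, take midpoints, keep the threshold with maximum gain) can be sketched as follows. The six class labels follow the slide's play-tennis row; the temperature values are assumed for illustration because the transcription does not preserve them.

```python
import math
from collections import Counter

def info(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def candidate_thresholds(examples):
    """Midpoints between adjacent (sorted) examples whose classes differ."""
    ordered = sorted(examples, key=lambda e: e[0])
    return [(a[0] + b[0]) / 2 for a, b in zip(ordered, ordered[1:]) if a[1] != b[1]]

def threshold_gain(examples, threshold):
    """Information gain of the binary split: value <= threshold versus value > threshold."""
    labels = [cls for _, cls in examples]
    below = [cls for value, cls in examples if value <= threshold]
    above = [cls for value, cls in examples if value > threshold]
    expected = sum(len(part) / len(labels) * info(part) for part in (below, above) if part)
    return info(labels) - expected

# Assumed temperatures paired with the slide's labels No, No, Yes, Yes, Yes, No.
EXAMPLES = [(40, "No"), (48, "No"), (60, "Yes"), (72, "Yes"), (80, "Yes"), (90, "No")]

THRESHOLDS = candidate_thresholds(EXAMPLES)
BEST = max(THRESHOLDS, key=lambda t: threshold_gain(EXAMPLES, t))
print(THRESHOLDS, BEST)
```

Only class boundaries are considered as thresholds, so the number of candidates is usually much smaller than the number of distinct attribute values.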
Pruning Criterion
- Use a separate set of examples to evaluate the utility of post-pruning nodes from the tree. CART uses cost-complexity pruning.
- Apply a statistical test to decide whether to expand (or prune) a particular node. C4.5 uses pessimistic pruning.
- Minimum Description Length (no test sample needed). SLIQ and SPRINT use MDL pruning.

Pruning Criterion: MDL
- The best binary decision tree is the one that can be encoded with the fewest number of bits.
- Selecting a scheme to encode a tree.
- Comparing various subtrees using the cost of encoding.
- The best model minimizes the cost.
Encoding schema:
- One bit to specify whether a node is a leaf (0) or an internal node (1).
- log a bits to specify the splitting attribute.
- Splitting the value for the attribute:
- categorical: log(v - 1) bits
- numerical: log(2^v - 2) bits

How do Associative Classifiers Work?
[Figure: a market-basket table (Transaction ID, Items Bought) with itemsets X,Y,Z; X,Z; X,V; U,V,W is modeled as transactions {Tid, Item_1, Item_2, Item_3, ..., Item_t}. A relational table (Atr1, Atr2, Atr3, ..., AtrN, Class Label) is modeled as transactions {Item_1, Item_2, Item_3, ..., Item_N, Class}.]
- Association rules: frequent k-itemsets {Item_a, Item_b, ..., Item_k}; rules {Itemset → Itemset}.
- Constrained association rules: constrained frequent k-itemsets {Item_a, ..., Item_k, Class_x}; rules {Itemset → Class}.
Modeling Data
[Figure: three kinds of input modeled as transactions, with an example itemset and rule for each.]
- Market baskets: {bread, milk, beer, ...}; itemset (Bread, milk); rule Bread → milk.
- Documents (illustrated by a sample text document): {term, term, ..., Ca}; itemset (term, Ca); rule term → Ca.
- Automatic diagnostic: {f, f, ..., Ca}; itemset (f3, f5, Ca); rule f3 ∧ f5 → Ca.

General Approach
[Figure: pipeline of an associative classifier.]
- Model input data into transactions: set of transactions <{i_1, i_2, ..., i_k}, c> (training data).
- Rule Generation: produces the set of association rules.
- Rule Pruning: produces the set of pruned rules.
- Rule Selection: for a new object, the applicable rules are found and the selected rules label the new object.
- Unlabeled new objects are also modeled into transactions and become labeled objects.

Association Rules: Classification for all Categories
- CBA (1998) [Apriori, confidence]: single class
- CMAR (2001) [FP-Growth, χ²]: single class
- ARC-AC (2002) [Apriori, confidence vote]: multi class
[Figure: association rules are mined once for all categories; the associative classifier ARC-AC puts each new object in its predicted class.]

Association Rules: Classification by Category
- ARC-BC (2002)
[Figure: association rules are mined separately for each category; the associative classifier ARC-BC puts each new object in its predicted class.]
Association Rules: Advantages & Issues
Advantages:
- AR are well studied: fast, scalable.
- No independence assumption between attributes.
- Attributes: large number, variable number, can handle missing values.
- Transparency.
Issues:
- AC are at an early stage of development: they use simple rules and a naïve selection function.
- AC models consist of a large number of rules: harder selection; redundant, uninteresting rules; longer classification time; difficult to manually revisit rules.
Solution: Pruning Techniques

Pruning Rules
Problems: large number of rules, noisy information, long classification time.
Solution: Pruning Techniques
- Removing low ranked specialized rules, e.g. R1: F1 → C (confidence 90%) and R2: F1 ∧ F2 → C (confidence 80%).
- Eliminating conflicting rules (for single-class classification), e.g. F → C1 and F → C2.
- Database coverage.

Classification Stage
Let S be the classification system and O = <f1; f3; f4; f7; f9> a new object.
Applicable rules:
- f1 → C, confidence 0.9
- f3 & f4 → C, confidence 0.85
- f4 → C, confidence 0.8
- f7 → C, confidence 0.6
- f9 → C3, confidence 0.5
Class scores: C1 0.85, C2 0.75, C3 0.5.
Using the dominance factor δ we choose the winning categories. If δ = 100%, C1 is winning. If δ = 80%, O is predicted to fall in C1 and C2.

Summary
[Figure: model input data into transactions (set of transactions <{i_1, i_2, ..., i_k}, c>), then Rule Generation (set of rules), Rule Pruning (set of rules) and Rule Selection, which turns unlabeled new objects into labeled objects.]
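The dominance-factor selection in the classification stage above can be sketched as a threshold relative to the best class score. That reading is an assumption: the slides give only the two outcomes for δ = 100% and δ = 80%, and the function below is one rule consistent with both. The scores are the class scores as transcribed from the slide.

```python
def winning_categories(scores, delta):
    """Keep every class whose score is at least `delta` times the best score.

    Assumed interpretation of the dominance factor; `delta` is a fraction in (0, 1].
    """
    best = max(scores.values())
    return [category for category, score in scores.items() if score >= delta * best]

SCORES = {"C1": 0.85, "C2": 0.75, "C3": 0.5}

print(winning_categories(SCORES, 1.0))  # single-class prediction
print(winning_categories(SCORES, 0.8))  # multi-class prediction
```

With δ = 1.0 only the top class survives, so the same function covers single-class and multi-class categorization.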
Open Problems?
[Figure: the associative classification pipeline (training data as a set of transactions <{i_1, i_2, ..., i_k}, c>, Rule Generation, Rule Pruning, Rule Selection, Classification), annotated with open problems.]
- Modelling transactions to incorporate more information.
- Support threshold-free rule generation.
- Rule value measure.
- New heuristics and new pruning strategies.
- Ranking rules.
- Rule representation.
- New heuristics and new selection strategies.

What is Prediction?
The goal of prediction is to forecast or deduce the value of an attribute based on values of other attributes.
- A model is first created based on the data distribution.
- The model is then used to predict future or unknown values.
In Data Mining:
- If forecasting a discrete value: Classification.
- If forecasting a continuous value: Prediction.

Prediction
Prediction of continuous values can be modeled by statistical techniques.
- Linear regression
- Multiple regression
- Polynomial regression
- Poisson regression
- Log-linear regression
- Etc.

Linear Regression
Linear regression: approximate the data distribution by a line
$$Y = \alpha + \beta X$$
Y is the response variable and X the predictor variable. α and β are regression coefficients specifying the intercept and the slope of the line. They are calculated by the least squares method:
$$\beta = \frac{\sum_{i=1}^{s} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{s} (x_i - \bar{x})^2}, \qquad \alpha = \bar{y} - \beta \bar{x}$$
where x̄ and ȳ are respectively the averages of x_1, x_2, ..., x_s and y_1, y_2, ..., y_s.
Multiple regression:
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2$$
Many nonlinear functions can be transformed into the above.
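The least squares formulas for α and β above can be computed directly. The sketch below is a plain-Python illustration; the function name `fit_line` and the four data points are assumptions chosen so that the fitted line is easy to verify by hand.

```python
def fit_line(xs, ys):
    """Return (alpha, beta) of the least squares line Y = alpha + beta * X."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)  # zero if all x are equal
    beta = numerator / denominator
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical points lying exactly on Y = 1 + 2X.
XS = [1, 2, 3, 4]
YS = [3, 5, 7, 9]

ALPHA, BETA = fit_line(XS, YS)
print(f"Y = {ALPHA} + {BETA} X")
print("prediction for X = 5:", ALPHA + BETA * 5)
```

Once α and β are known, predicting a continuous value for a new X is a single evaluation of the line.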
More informationChapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance
Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss
More informationKLT Tracker. Alignment. 1. Detect Harris corners in the first frame. 2. For each Harris corner compute motion between consecutive frames
KLT Tracker Tracker. Detect Harrs corers the frst frame 2. For each Harrs corer compute moto betwee cosecutve frames (Algmet). 3. Lk moto vectors successve frames to get a track 4. Itroduce ew Harrs pots
More informationMULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov
Iteratoal Boo Seres "Iformato Scece ad Computg" 97 MULTIIMNSIONAL HTROGNOUS VARIABL PRICTION BAS ON PRTS STATMNTS Geady Lbov Maxm Gerasmov Abstract: I the wors [ ] we proposed a approach of formg a cosesus
More informationChapter 8. Inferences about More Than Two Population Central Values
Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha
More information12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model
1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed
More informationSTA302/1001-Fall 2008 Midterm Test October 21, 2008
STA3/-Fall 8 Mdterm Test October, 8 Last Name: Frst Name: Studet Number: Erolled (Crcle oe) STA3 STA INSTRUCTIONS Tme allowed: hour 45 mutes Ads allowed: A o-programmable calculator A table of values from
More informationCIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights
CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:
More informationSimple Linear Regression
Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal
More informationUNIT 2 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS
Numercal Computg -I UNIT SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS Structure Page Nos..0 Itroducto 6. Objectves 7. Ital Approxmato to a Root 7. Bsecto Method 8.. Error Aalyss 9.4 Regula Fals Method
More informationSection l h l Stem=Tens. 8l Leaf=Ones. 8h l 03. 9h 58
Secto.. 6l 34 6h 667899 7l 44 7h Stem=Tes 8l 344 Leaf=Oes 8h 5557899 9l 3 9h 58 Ths dsplay brgs out the gap the data: There are o scores the hgh 7's. 6. a. beams cylders 9 5 8 88533 6 6 98877643 7 488
More informationThe number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter
LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s
More informationSTATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1
STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ
More informationLecture 3 Probability review (cont d)
STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1
STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the
More informationChapter Statistics Background of Regression Analysis
Chapter 06.0 Statstcs Backgroud of Regresso Aalyss After readg ths chapter, you should be able to:. revew the statstcs backgroud eeded for learg regresso, ad. kow a bref hstory of regresso. Revew of Statstcal
More informationSTA 105-M BASIC STATISTICS (This is a multiple choice paper.)
DCDM BUSINESS SCHOOL September Mock Eamatos STA 0-M BASIC STATISTICS (Ths s a multple choce paper.) Tme: hours 0 mutes INSTRUCTIONS TO CANDIDATES Do ot ope ths questo paper utl you have bee told to do
More informationMultivariate Transformation of Variables and Maximum Likelihood Estimation
Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty
More informationMean is only appropriate for interval or ratio scales, not ordinal or nominal.
Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot
More informationTESTS BASED ON MAXIMUM LIKELIHOOD
ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal
More informationFault Diagnosis Using Feature Vectors and Fuzzy Fault Pattern Rulebase
Fault Dagoss Usg Feature Vectors ad Fuzzy Fault Patter Rulebase Prepared by: FL Lews Updated: Wedesday, ovember 03, 004 Feature Vectors The requred puts for the dagostc models are termed the feature vectors
More informationRegression and the LMS Algorithm
CSE 556: Itroducto to Neural Netorks Regresso ad the LMS Algorthm CSE 556: Regresso 1 Problem statemet CSE 556: Regresso Lear regresso th oe varable Gve a set of N pars of data {, d }, appromate d b a
More informationLecture 1 Review of Fundamental Statistical Concepts
Lecture Revew of Fudametal Statstcal Cocepts Measures of Cetral Tedecy ad Dsperso A word about otato for ths class: Idvduals a populato are desgated, where the dex rages from to N, ad N s the total umber
More informationThe Mathematical Appendix
The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.
More informationCHAPTER 4 RADICAL EXPRESSIONS
6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube
More informationSpecial Instructions / Useful Data
JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth
More information1 Onto functions and bijections Applications to Counting
1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of
More informationChapter Two. An Introduction to Regression ( )
ubject: A Itroducto to Regresso Frst tage Chapter Two A Itroducto to Regresso (018-019) 1 pg. ubject: A Itroducto to Regresso Frst tage A Itroducto to Regresso Regresso aalss s a statstcal tool for the
More informationC. Statistics. X = n geometric the n th root of the product of numerical data ln X GM = or ln GM = X 2. X n X 1
C. Statstcs a. Descrbe the stages the desg of a clcal tral, takg to accout the: research questos ad hypothess, lterature revew, statstcal advce, choce of study protocol, ethcal ssues, data collecto ad
More information13. Parametric and Non-Parametric Uncertainties, Radial Basis Functions and Neural Network Approximations
Lecture 7 3. Parametrc ad No-Parametrc Ucertates, Radal Bass Fuctos ad Neural Network Approxmatos he parameter estmato algorthms descrbed prevous sectos were based o the assumpto that the system ucertates
More informationSupport vector machines
CS 75 Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Outle Outle: Algorthms for lear decso boudary Support vector maches Mamum marg hyperplae.
More informationOrdinary Least Squares Regression. Simple Regression. Algebra and Assumptions.
Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos
More informationPart 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))
art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the
More informationTowards Multi-Layer Perceptron as an Evaluator Through Randomly Generated Training Patterns
Proceedgs of the 5th WSEAS It. Cof. o Artfcal Itellgece, Kowledge Egeerg ad Data Bases, Madrd, Spa, February 5-7, 26 (pp254-258) Towards Mult-Layer Perceptro as a Evaluator Through Ramly Geerated Trag
More information4. Standard Regression Model and Spatial Dependence Tests
4. Stadard Regresso Model ad Spatal Depedece Tests Stadard regresso aalss fals the presece of spatal effects. I case of spatal depedeces ad/or spatal heterogeet a stadard regresso model wll be msspecfed.
More informationOvercoming Limitations of Sampling for Aggregation Queries
CIS 6930 Approxmate Quer Processg Paper Presetato Sprg 2004 - Istructor: Dr Al Dobra Overcomg Lmtatos of Samplg for Aggregato Queres Authors: Surajt Chaudhur, Gautam Das, Maur Datar, Rajeev Motwa, ad Vvek
More informationA New Family of Transformations for Lifetime Data
Proceedgs of the World Cogress o Egeerg 4 Vol I, WCE 4, July - 4, 4, Lodo, U.K. A New Famly of Trasformatos for Lfetme Data Lakhaa Watthaacheewakul Abstract A famly of trasformatos s the oe of several
More informationExample: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger
Example: Multple lear regresso 5000,00 4000,00 Tro Aders Moger 0.0.007 brthweght 3000,00 000,00 000,00 0,00 50,00 00,00 50,00 00,00 50,00 weght pouds Repetto: Smple lear regresso We defe a model Y = β0
More informationA Combination of Adaptive and Line Intercept Sampling Applicable in Agricultural and Environmental Studies
ISSN 1684-8403 Joural of Statstcs Volume 15, 008, pp. 44-53 Abstract A Combato of Adaptve ad Le Itercept Samplg Applcable Agrcultural ad Evrometal Studes Azmer Kha 1 A adaptve procedure s descrbed for
More informationLogistic regression (continued)
STAT562 page 138 Logstc regresso (cotued) Suppose we ow cosder more complex models to descrbe the relatoshp betwee a categorcal respose varable (Y) that takes o two (2) possble outcomes ad a set of p explaatory
More informationAnalysis of Variance with Weibull Data
Aalyss of Varace wth Webull Data Lahaa Watthaacheewaul Abstract I statstcal data aalyss by aalyss of varace, the usual basc assumptos are that the model s addtve ad the errors are radomly, depedetly, ad
More information