I. Decision trees II. Ensemble methods: Mixtures of experts


1 CS 75 Machine Learning, Lecture 4
I. Decision trees
II. Ensemble methods: Mixtures of experts
Milos Hauskrecht, milos@cs.pitt.edu, 539 Sennott Square

Schedule
- Exam: April 8 7
- Term projects & project presentations: April 5 7, at :-4:pm in SNSQ 533
- No class on April 3 7

2 Decision trees
An alternative approach to classification:
- Partition the input space into regions
- Classify independently in every region

Decision trees
The partitioning idea is used in the decision tree model:
- Split the space recursively according to the inputs in $x$
- Classify (assign a class label) at the bottom of the tree
Example: binary classification $\{0, 1\}$ with binary attributes $x_1, x_2, x_3$
[Figure: example decision tree over the attributes, with branches labeled t (true) and f (false)]

3 Decision trees
How to construct the decision tree? Top-down algorithm:
- Find the best split condition (quantified by the impurity measure) on the training set
- Stop when no improvement is possible
Impurity measure:
- Measures how well the two classes are separated
- Ideally we would like to separate all 0s and 1s
Splits of finite vs. continuous value attributes:
- Continuous value attributes use conditions such as $x_3 \le 0.5$

Impurity measure
Let
- $|D|$ be the total number of data entries
- $|D_i|$ be the number of data entries classified as $i$
- $p_i = |D_i| / |D|$ be the ratio of instances classified as $i$
The impurity measure defines how well the classes in the training dataset are separated. In general, the impurity measure should satisfy:
- It is largest when the data are split evenly among the classes: $p_i = 1 / (\text{number of classes})$
- It is 0 when all data belong to the same class

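4 Impurity measures
There are various impurity measures used in the literature, where $k$ is the number of classes:
- Entropy-based measure (Quinlan, C4.5):
  $$I(D) = \mathrm{Entropy}(D) = -\sum_{i=1}^{k} p_i \log p_i$$
  [Figure: example for the entropy measure]
- Gini measure (Breiman, CART):
  $$I(D) = \mathrm{Gini}(D) = 1 - \sum_{i=1}^{k} p_i^2$$

Decision-tree building
Gain due to a split: the expected reduction in the impurity measure (entropy example):
$$\mathrm{Gain}(D, A) = \mathrm{Entropy}(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D^v|}{|D|}\,\mathrm{Entropy}(D^v)$$
where $D^v$ is the partition of $D$ with the value of attribute $A = v$.
[Figure: a split on $x_3$ sends $D$, with $\mathrm{Entropy}(D)$, down branches t and f, with $\mathrm{Entropy}(D^t)$ and $\mathrm{Entropy}(D^f)$]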
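The entropy, Gini, and gain formulas can be checked on a toy dataset. The following is a minimal sketch, not code from the lecture: the function names, the dictionary-based row format, and the choice of base-2 logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter

def class_ratios(labels):
    """Return p_i = |D_i| / |D| for every class i present in labels."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]

def entropy(labels):
    """Entropy(D) = -sum_i p_i log p_i (base 2 is an assumption)."""
    return sum(-p * math.log2(p) for p in class_ratios(labels))

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2."""
    return 1.0 - sum(p * p for p in class_ratios(labels))

def gain(rows, labels, attribute):
    """Gain(D, A): expected entropy reduction from splitting on attribute A."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: the class equals x1, and x2 is irrelevant.
rows = [
    {"x1": True, "x2": True},
    {"x1": True, "x2": False},
    {"x1": False, "x2": True},
    {"x1": False, "x2": False},
]
labels = [1, 1, 0, 0]

print(entropy(labels))           # 1.0: classes are split evenly
print(gini(labels))              # 0.5
print(gain(rows, labels, "x1"))  # 1.0: the split separates the classes
print(gain(rows, labels, "x2"))  # 0.0: the split does not help
```

Both measures are largest for the evenly split label set and drop to 0 for a pure partition, which matches the two requirements stated for impurity measures.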

5 Decision tree learning
Greedy learning algorithm:
Repeat until no or small improvement in the purity:
- Find the attribute with the highest gain
- Add the attribute to the tree and split the set accordingly
The algorithm builds the tree in a top-down fashion and gradually expands the leaves of the partially built tree.
The method is greedy:
- It looks at a single attribute and its gain in each step
- It may fail when a combination of attributes is needed to improve the purity (parity functions)

Decision tree learning
Limitations of greedy methods:
- Cases in which only a combination of two or more attributes improves the impurity
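The greedy procedure and its parity limitation can be illustrated with a short sketch. This is an assumption-laden illustration rather than the lecture's algorithm verbatim: the tree representation (a label for a leaf, an `(attribute, children)` tuple for an internal node), the `min_gain` stopping threshold, and the helper names are hypothetical, and entropy with base-2 logarithms is used as the impurity measure.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(-c / total * math.log2(c / total)
               for c in Counter(labels).values())

def gain(rows, labels, attr):
    parts = {}
    for row, label in zip(rows, labels):
        parts.setdefault(row[attr], []).append(label)
    return entropy(labels) - sum(len(p) / len(labels) * entropy(p)
                                 for p in parts.values())

def build_tree(rows, labels, attrs, min_gain=1e-9):
    """Greedy top-down construction using the single best attribute per step."""
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:
        return majority
    best = max(attrs, key=lambda a: gain(rows, labels, a))
    if gain(rows, labels, best) < min_gain:  # no or small improvement: stop
        return majority
    remaining = [a for a in attrs if a != best]
    children = {}
    for value in {row[best] for row in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        children[value] = build_tree([r for r, _ in pairs],
                                     [l for _, l in pairs],
                                     remaining, min_gain)
    return (best, children)

def classify(tree, row):
    while isinstance(tree, tuple):  # descend until a leaf label is reached
        attr, children = tree
        tree = children[row[attr]]
    return tree

rows = [{"x1": a, "x2": b} for a in (True, False) for b in (True, False)]

# AND: single attributes carry some gain, so the greedy search succeeds.
and_labels = [int(r["x1"] and r["x2"]) for r in rows]
and_tree = build_tree(rows, and_labels, ["x1", "x2"])
print([classify(and_tree, r) for r in rows] == and_labels)

# XOR (a parity function): every single attribute has zero gain,
# so the greedy search stops at the root and returns a single leaf.
xor_labels = [int(r["x1"] != r["x2"]) for r in rows]
print([gain(rows, xor_labels, a) for a in ("x1", "x2")])
print(build_tree(rows, xor_labels, ["x1", "x2"]))
```

In the XOR case only the pair of attributes taken together separates the classes, which is exactly the situation the single-attribute gain cannot detect.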

6 Decision tree learning
By reducing the impurity measure we can grow very large trees.
Problem: overfitting
- We may split and classify the training set very well, but we may do worse in terms of the generalization error
Solutions to the overfitting problem:
- Solution 1. Prune branches of the tree built in the first phase, using an internal validation set to test for the overfit
- Solution 2. Test for the overfit in the tree-building phase, and stop building the tree when performance on the validation set deteriorates

Mixture of experts model
Ensemble methods:
- Use a combination of simpler learners to improve predictions
Mixture of experts model:
- Different input regions are covered with different learners
- A "soft" switching between learners
Mixture of experts: expert = learner

7 Mixture of experts model
Gating network: decides what expert to use
- $g_1, g_2, \ldots, g_k$ are the gating functions
[Figure: a gating network alongside Expert 1, Expert 2, ..., Expert k]

Learning mixture of experts
Learning consists of two tasks:
- Learn the parameters of the individual expert networks
- Learn the parameters of the gating network, which decides where to make a split
Assume the gating functions give probabilities:
$$0 \le g_1(x), g_2(x), \ldots, g_k(x) \le 1, \qquad \sum_{u=1}^{k} g_u(x) = 1$$
Based on these probabilities we partition the space; the partitions belong to different experts.
How to model the gating network? A multiway classifier model:
- the softmax model
- a generative classifier model
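A softmax gating network that produces such probabilities can be sketched in a few lines. The function name `gating`, the parameter layout (one weight vector per expert), and the example numbers are assumptions for illustration; subtracting the maximum score is a standard numerical-stability trick and does not change the result.

```python
import math

def gating(x, eta):
    """Softmax gating: g_i(x) = exp(eta_i . x) / sum_u exp(eta_u . x)."""
    scores = [sum(e * xj for e, xj in zip(eta_i, x)) for eta_i in eta]
    top = max(scores)  # shift scores so the largest exponent is 0
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical input; the leading 1.0 plays the role of the bias input.
x = [1.0, 2.0]
# Hypothetical gating parameters, one vector per expert (k = 3).
eta = [[0.0, 1.0], [0.0, -1.0], [0.5, 0.0]]

g = gating(x, eta)
print(g)       # three values in [0, 1]
print(sum(g))  # approximately 1
```

Because the outputs are positive and sum to one, they can be read as a soft assignment of the input to the experts rather than a hard split.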

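8 Learning mixture of experts
Assume we have a set of linear experts:
$$\mu_i = \theta_i^T x$$
(Note: bias terms are hidden in $x$.)
Assume a softmax gating network:
$$g_i(x) = \frac{\exp(\eta_i^T x)}{\sum_{u=1}^{k} \exp(\eta_u^T x)} \approx p(\omega_i \mid x, \eta)$$
Likelihood of $y$ (assuming that the errors for different experts are normally distributed with the same variance):
$$P(y \mid x, \Theta, \eta) = \sum_{i=1}^{k} P(\omega_i \mid x, \eta)\, p(y \mid x, \omega_i, \Theta) = \sum_{i=1}^{k} \left[\frac{\exp(\eta_i^T x)}{\sum_{j=1}^{k} \exp(\eta_j^T x)}\right] \left[\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\|y - \mu_i\|^2}{2\sigma^2}\right)\right]$$

Learning mixture of experts
Gradient learning. On-line update rule for the parameters $\theta_i$ of expert $i$:
- If we know the expert that is responsible for $x$:
  $$\theta_{ij} \leftarrow \theta_{ij} + \alpha\,(y - \mu_i)\,x_j$$
- If we do not know the expert:
  $$\theta_{ij} \leftarrow \theta_{ij} + \alpha\,h_i\,(y - \mu_i)\,x_j$$
Here $h_i$ is the responsibility of the $i$th expert, a kind of posterior (written for $\sigma = 1$; the shared normalizing constant cancels):
$$h_i(x, y) = \frac{g_i(x)\,p(y \mid x, \omega_i, \theta)}{\sum_{u=1}^{k} g_u(x)\,p(y \mid x, \omega_u, \theta)} = \frac{g_i(x)\exp\left(-\frac{1}{2}\|y - \mu_i\|^2\right)}{\sum_{u=1}^{k} g_u(x)\exp\left(-\frac{1}{2}\|y - \mu_u\|^2\right)}$$
where $g_i(x)$ acts as a prior and $\exp(\cdot)$ as a likelihood.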
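The responsibility computation and the on-line expert update can be sketched as follows. This is an illustrative sketch under stated assumptions: the output $y$ is taken to be a scalar, the helper names (`dot`, `softmax`, `responsibilities`, `update_experts`) and the example parameters are hypothetical, and a single shared learning rate `alpha` is used.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def softmax(scores):
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def responsibilities(x, y, theta, eta, sigma=1.0):
    """h_i = g_i(x) p(y | x, expert i) / sum_u g_u(x) p(y | x, expert u)."""
    g = softmax([dot(eta_i, x) for eta_i in eta])   # prior over experts
    mu = [dot(theta_i, x) for theta_i in theta]     # linear expert outputs
    # The Gaussian normalizer is shared by all experts and cancels in the ratio.
    lik = [math.exp(-(y - m) ** 2 / (2.0 * sigma ** 2)) for m in mu]
    joint = [gi * li for gi, li in zip(g, lik)]
    total = sum(joint)
    return [j / total for j in joint], mu

def update_experts(x, y, theta, eta, alpha=0.1):
    """One on-line step: theta_ij += alpha * h_i * (y - mu_i) * x_j."""
    h, mu = responsibilities(x, y, theta, eta)
    return [[t + alpha * h_i * (y - m) * xj for t, xj in zip(theta_i, x)]
            for theta_i, h_i, m in zip(theta, h, mu)]

# Hypothetical data point and parameters for k = 2 experts.
x, y = [1.0, 2.0], 3.0
theta = [[0.0, 1.5], [0.0, -1.0]]   # expert 0 predicts 3.0, expert 1 predicts -2.0
eta = [[0.0, 0.0], [0.0, 0.0]]      # uninformative gating: g = [0.5, 0.5]

h, mu = responsibilities(x, y, theta, eta)
print(h)  # almost all responsibility goes to expert 0
new_theta = update_experts(x, y, theta, eta)
print(new_theta)
```

The expert that already explains the data point receives nearly all of the responsibility, and the poorly fitting expert is moved only slightly because its update is scaled by a very small $h_i$.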

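9 Learning mixtures of experts
Gradient methods. On-line learning of the gating network parameters $\eta_i$:
$$\eta_{ij} \leftarrow \eta_{ij} + \beta\,\big(h_i(x, y) - g_i(x)\big)\,x_j$$
The learning with conditioned mixtures can be extended to learning the parameters of an arbitrary expert network, e.g. logistic regression or a multilayer neural network:
$$\theta_{ij} \leftarrow \theta_{ij} + \beta\,\frac{\partial l}{\partial \theta_{ij}}, \qquad \frac{\partial l}{\partial \theta_{ij}} = \frac{\partial l}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial \theta_{ij}} = h_i\,(y - \mu_i)\,\frac{\partial \mu_i}{\partial \theta_{ij}}$$
For a linear expert, $\partial \mu_i / \partial \theta_{ij} = x_j$, which recovers the on-line update rule for linear experts.

Learning mixture of experts
The EM algorithm offers an alternative way to learn the mixture.
Algorithm:
Initialize parameters $\Theta$
Repeat
  Set $\Theta' = \Theta$
  1. Expectation step:
     $$Q(\Theta \mid \Theta') = E_{H \mid X, Y, \Theta'}\,\log P(H, Y \mid X, \Theta, \xi)$$
  2. Maximization step:
     $$\Theta = \arg\max_{\Theta} Q(\Theta \mid \Theta')$$
until no or small improvement in $Q(\Theta \mid \Theta')$
Hidden variables are the identities of the expert networks responsible for the $(x, y)$ data points.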
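The on-line gating update can be sketched in isolation, taking the responsibilities as given. This is a minimal sketch, not the lecture's code: the helper names, the shared learning rate `beta`, and the example values of `h` are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def softmax(scores):
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def gating(x, eta):
    return softmax([dot(eta_i, x) for eta_i in eta])

def update_gating(x, h, eta, beta=0.1):
    """One on-line step: eta_ij += beta * (h_i - g_i(x)) * x_j."""
    g = gating(x, eta)
    return [[e + beta * (h_i - g_i) * xj for e, xj in zip(eta_i, x)]
            for eta_i, h_i, g_i in zip(eta, h, g)]

# Hypothetical setup: two experts, uninformative gating, and responsibilities
# saying that expert 0 explains this data point much better.
x = [1.0, 2.0]
eta = [[0.0, 0.0], [0.0, 0.0]]
h = [0.9, 0.1]

new_eta = update_gating(x, h, eta)
print(gating(x, eta))      # [0.5, 0.5] before the update
print(gating(x, new_eta))  # shifted toward expert 0 after the update
```

The update pushes the gating probabilities toward the responsibilities, so the gating network gradually learns which expert to prefer in which input region.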

Generative classification models

CS 675 Intro to Machine Learning
Lecture: Generative classification models
Milos Hauskrecht, milos@cs.pitt.edu, 539 Sennott Square

Data: $D = \{d_1, d_2, \ldots, d_n\}$
Classification: $y$ represents a discrete class value
Goal: learn