Machine Learning

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. (Mitchell, 1997)

Things learn when they change their behavior in a way that makes them perform better in the future. (Witten, Frank, 1999)

Types of learning:
- knowledge acquisition
- skill refinement

[Figure: Relation between machine learning and data mining]

P. Berka, 2018

[Figure: General scheme of a learning system. Labels: learning, decision making, select representation, knowledge, object description, decision]

Learning methods:
1. rote learning,
2. learning from instruction (learning by being told),
3. learning by analogy (instance-based learning, lazy learning),
4. explanation-based learning,
5. learning from examples,
6. learning from observation and discovery.

Learning methods:
- statistical methods: regression methods, discriminant analysis, cluster analysis,
- symbolic machine learning methods: decision trees and rules, case-based reasoning (CBR),
- sub-symbolic machine learning methods: neural networks, Bayesian networks or genetic algorithms.

Feedback during learning:
- pre-classified examples (supervised learning),
- a small number of pre-classified examples and a large number of examples without known class (semi-supervised learning),
- the algorithm can query the teacher for the class membership of unclassified examples (active learning),
- indirect hints derived from the teacher's behavior (apprenticeship learning),
- no feedback (unsupervised learning).

Representation of examples:
1. attributes: categorical (binary, nominal, ordinal) and numeric
   [hair=black & height=180 & beard=yes & education=univ]
2. relations
   father(jan_lucembursky, karel_iv)

Algorithms:
- batch: all examples are processed at once,
- incremental: examples are processed one after another, and the system can be re-trained.

Learning methods:
- empirical: uses a large set of (training) examples and limited (or no) background knowledge,
- analytic: uses extensive background knowledge and several (one or even no) illustrative examples.
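As a small illustration of the two representations above, the sketch below stores an attribute-value example as a dictionary and a relation as a (predicate, arguments) tuple. The attribute kinds and the helper `holds` are assumptions made for this sketch, not part of the slides.

```python
# One example described by attribute values (names taken from the slide).
example = {"hair": "black", "height": 180, "beard": "yes", "education": "univ"}

# Attribute kinds: an assumption for this sketch, the slide does not assign them.
attribute_types = {
    "hair": "nominal",
    "height": "numeric",
    "beard": "binary",
    "education": "ordinal",
}

# A relation stored as (predicate, arguments).
relation = ("father", ("jan_lucembursky", "karel_iv"))
facts = {relation}


def holds(predicate, args, facts):
    """Return True if the relation predicate(args) is among the known facts."""
    return (predicate, tuple(args)) in facts


print(holds("father", ["jan_lucembursky", "karel_iv"], facts))
print(holds("father", ["karel_iv", "jan_lucembursky"], facts))
```

The order of arguments matters for a relation, which is why the second query does not hold.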

Principles of empirical concept learning

1. Examples of the same class have similar characteristics (similarity-based learning).
   - Examples of the same class create clusters in the attribute space.
   - The goal of learning is to find and represent these clusters.
   - "Garbage in, garbage out" problem.
   - Importance of data understanding and preprocessing.

2. General knowledge is inferred from a finite set of examples (inductive learning).
   Examples are divided into 2 (or 3) sets:
   - training set to build a model,
   - (validation set to tune the parameters),
   - testing set to test the model.
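The division into training, validation and testing sets can be sketched as follows. The function name `split_examples`, the 60/20/20 proportions and the fixed random seed are assumptions chosen for illustration; the slides do not prescribe them.

```python
import random


def split_examples(examples, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle the examples and cut them into training, validation and testing sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains is used for testing
    return train, valid, test


train, valid, test = split_examples(range(10))
print(len(train), len(valid), len(test))
```

Setting `valid_frac=0` gives the two-set variant with only training and testing data.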
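
General definition of (supervised) machine learning

Analyzed data:

\[
D = \begin{bmatrix}
x_{11} & x_{12} & \dots & x_{1m} \\
x_{21} & x_{22} & \dots & x_{2m} \\
\vdots & \vdots &       & \vdots \\
x_{n1} & x_{n2} & \dots & x_{nm}
\end{bmatrix}
\]

Rows in the table represent objects (examples, instances).
Columns in the table correspond to attributes (variables).

When a target attribute is added to the data table, we obtain data suitable for supervised learning methods (so-called training data):

\[
D_{TR} = \begin{bmatrix}
x_{11} & x_{12} & \dots & x_{1m} & y_1 \\
x_{21} & x_{22} & \dots & x_{2m} & y_2 \\
\vdots & \vdots &       & \vdots & \vdots \\
x_{n1} & x_{n2} & \dots & x_{nm} & y_n
\end{bmatrix}
\]

Classification task: to find knowledge (represented by a decision function $f$) that assigns a value of the target attribute $y$ to an object described by the values of the input attributes $\mathbf{x}$:

\[ f: \mathbf{x} \rightarrow y \]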

During classification we infer, from the values of the input attributes $\mathbf{x}$ of an object, the value of the target attribute:

\[ \hat{y} = f(\mathbf{x}) \]

The derived value $\hat{y}$ can differ from the real value $y$. We can thus compute the classification error $Q_f(o_i, \hat{y}_i)$ for every object $o_i \in D_{TR}$:

- for a numeric attribute C e.g. as
  \[ Q_f(o_i, \hat{y}_i) = (y_i - \hat{y}_i)^2 \]
- for a categorical attribute C e.g. as
  \[ Q_f(o_i, \hat{y}_i) = \begin{cases} 1 & \text{iff } y_i \neq \hat{y}_i \\ 0 & \text{iff } y_i = \hat{y}_i \end{cases} \]

We can compute the overall error $Err(f, D_{TR})$ for the whole training set $D_{TR}$ e.g. as the mean error:

\[ Err(f, D_{TR}) = \frac{1}{n} \sum_{i=1}^{n} Q_f(o_i, \hat{y}_i) \]

The goal of learning is to find knowledge $f^*$ that minimizes this error:

\[ Err(f^*, D_{TR}) = \min_f Err(f, D_{TR}) \]
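The two per-object errors and the mean error can be written down directly. In the sketch below the function names and the tiny data sets are illustrative assumptions; the training data is represented as a list of (x, y) pairs.

```python
def squared_error(y, y_hat):
    """Q for a numeric target: squared difference."""
    return (y - y_hat) ** 2


def zero_one_error(y, y_hat):
    """Q for a categorical target: 1 on a mismatch, 0 otherwise."""
    return 0 if y == y_hat else 1


def mean_error(f, training_data, q):
    """Err(f, D_TR): average of q(y, f(x)) over all (x, y) pairs."""
    return sum(q(y, f(x)) for x, y in training_data) / len(training_data)


# Numeric target: f(x) = 2x misses only the last example, by 1.
numeric_data = [(1, 2), (2, 4), (3, 7)]
print(mean_error(lambda x: 2 * x, numeric_data, squared_error))

# Categorical target: a constant classifier that always answers "yes".
categorical_data = [("a", "yes"), ("b", "no"), ("c", "yes")]
print(mean_error(lambda x: "yes", categorical_data, zero_one_error))
```

Both calls give a mean error of one third, because exactly one of the three examples contributes an error of 1.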

1. Learning as search

Looking for both the structure and the parameters of the model.

Models as cluster descriptions:
- MGM: most general model (one cluster for all examples),
- MSM: most specific model(s) (each example creates a cluster).

[Figure: M1 is more general than M2, M2 is more specific than M1]

The number of possible models (partitions of $n$ examples into clusters) is given by the Bell numbers:

\[ B(n+1) = \sum_{k=0}^{n} \binom{n}{k} B(k), \qquad B(0) = 1 \]

n      1   2   3   4    5    10
B(n)   1   2   5   15   52   115975
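The Bell number recurrence can be evaluated directly to see how quickly the number of models grows. This is a minimal sketch; the function name `bell_numbers` is an assumption, and `math.comb` requires Python 3.8 or newer.

```python
from math import comb


def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)] using B(n+1) = sum_k C(n, k) * B(k)."""
    bell = [1]  # B(0) = 1
    for n in range(n_max):
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell


bell = bell_numbers(10)
for n in (1, 2, 3, 4, 5, 10):
    print(n, bell[n])
```

The printed values reproduce the table above, including B(10) = 115975.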

Search methods:

Direction
- top-down (from general to specific models),
- bottom-up (from specific to general models).

Strategy
- blind (we consider every possibility of how to specialize/generalize the given model),
- heuristic (we use some criterion to select only the best possibilities of how to specialize/generalize the given model),
- random.

Bandwidth
- single (we consider only one transformation of the current model),
- parallel (we consider several transformations).
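
Example:
Let us assume that both the input attributes and the target attribute are categorical.

Let us call a value of an attribute a category:

1. atomic formula that expresses a property of object $o_i$:
   \[ A_j(v_k)(o_i) = \begin{cases} 1 & \text{for } x_{ij} = v_k \\ 0 & \text{for } x_{ij} \neq v_k \end{cases} \]
2. set of objects that fulfill the given property:
   \[ A_j(v_k) = \{ o_i : x_{ij} = v_k \} \]

Combinations are created from categories using logical AND:

\[ Comb = [A_{j_1}(v_{k_1}), A_{j_2}(v_{k_2}), \dots, A_{j_l}(v_{k_l})] = A_{j_1}(v_{k_1}) \wedge A_{j_2}(v_{k_2}) \wedge \dots \wedge A_{j_l}(v_{k_l}) \]

1. as a formula:
   \[ Comb(o_i) = \begin{cases} 1 & \text{if } x_{ij_1} = v_{k_1} \wedge x_{ij_2} = v_{k_2} \wedge \dots \wedge x_{ij_l} = v_{k_l} \\ 0 & \text{else} \end{cases} \]
2. as a set of objects:
   \[ Comb = \{ o_i : x_{ij_1} = v_{k_1} \wedge x_{ij_2} = v_{k_2} \wedge \dots \wedge x_{ij_l} = v_{k_l} \} \]

Comb covers object $o_i$ iff $Comb(o_i) = 1$.

We can create supercombinations by adding categories to a combination and create subcombinations by removing categories from a combination.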

Partial ordering between combinations:

If combination Comb1 is a subcombination of combination Comb2, then Comb1 is more general than Comb2 and Comb2 is more specific than Comb1.

If combination Comb1 is more general than combination Comb2, then Comb1 covers at least all objects that are covered by Comb2 (downward-closure property).

The resulting knowledge will be represented by combinations that cover only examples of a given class.

Combination Comb is consistent iff it covers only examples of a single class $C(v_t)$:

\[ \forall o_i \in D_{TR}: \; Comb(o_i) = 1 \Rightarrow y_i = v_t \]

Example data $D_{TR}$ (translated from Czech; the original attribute names are given in parentheses):

income     account   sex        unemployed       car      housing     loan
(příjem)   (konto)   (pohlaví)  (nezaměstnaný)   (auto)   (bydlení)   (úvěr)
high       high      female     no               yes      own         yes
high       high      male       no               yes      own         yes
low        low       male       no               yes      rented      no
high       high      male       no               no       rented      yes

Original Czech values: vysoký/vysoké = high, nízký/nízké = low, žena = female, muž = male, ano = yes, ne = no, vlastní = own, nájemní = rented.

A combination Comb (a hypothesis representing the concept "loan") can contain the following values of an attribute:
- ? to indicate that the value of this attribute is irrelevant,
- a value of the attribute,
- ∅ to indicate that no value of this attribute is applicable.
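The notions of covering, generality and consistency can be checked mechanically on the example data. The sketch below is one possible encoding, not the author's: hypotheses are tuples with "?" for an irrelevant attribute, the English value names are translations, and the ∅ value is left out for brevity.

```python
# Example data: (income, account, sex, unemployed, car, housing) -> loan
DATA = [
    (("high", "high", "female", "no", "yes", "own"), "yes"),
    (("high", "high", "male", "no", "yes", "own"), "yes"),
    (("low", "low", "male", "no", "yes", "rented"), "no"),
    (("high", "high", "male", "no", "no", "rented"), "yes"),
]


def covers(comb, obj):
    """Comb(o) = 1 iff every specified category matches the object."""
    return all(c == "?" or c == v for c, v in zip(comb, obj))


def more_general_or_equal(comb1, comb2):
    """comb1 is a subcombination of comb2 (it constrains no more than comb2)."""
    return all(a == "?" or a == b for a, b in zip(comb1, comb2))


def is_consistent(comb, data):
    """A combination is consistent iff it covers examples of one class only."""
    classes = {y for x, y in data if covers(comb, x)}
    return len(classes) <= 1


print(is_consistent(("?", "?", "?", "?", "?", "?"), DATA))     # covers both classes
print(is_consistent(("high", "?", "?", "?", "?", "?"), DATA))  # covers only loan = yes
```

A more general combination covers a superset of the objects covered by a more specific one, so consistency can only be lost, never gained, when a combination is generalized.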

Hypothesis space

[Figure: part of the hypothesis space, ordered from the most general hypothesis (top) to the most specific one (bottom). Attribute order: income, account, sex, unemployed, car, housing; values translated from Czech.]

[?, ?, ?, ?, ?, ?]
...
[?, ?, female, ?, ?, ?]   [high, ?, ?, ?, ?, ?]   [?, high, ?, ?, ?, ?]   [?, ?, ?, ?, ?, own]
...
[high, ?, ?, no, ?, ?]   [high, high, ?, ?, ?, ?]   [?, high, ?, no, ?, ?]
[high, high, ?, no, ?, ?]
[high, high, ?, no, yes, ?]   [high, high, ?, no, ?, own]   [high, high, male, no, ?, ?]
[high, high, ?, no, yes, own]   [high, high, male, no, ?, own]   [high, high, female, no, ?, own]   [high, high, male, no, yes, ?]   [high, high, male, no, no, ?]
[high, high, female, no, yes, own]   [high, high, male, no, yes, own]   [high, high, male, no, no, rented]
[∅, ∅, ∅, ∅, ∅, ∅]

We can traverse the hypothesis space using two methods:
- from general to specific (top-down, specialization),
- from specific to general (bottom-up, generalization).

Find-S algorithm

1. Initialize h to the most specific hypothesis in H
2. For each positive training example x
   2.1. For each attribute a from hypothesis h:
        if the value of attribute a does not correspond to x,
        replace the value of a by the next more general value that corresponds to x
3. Output h

Result for the example data (values translated from Czech):

S: [high, high, ?, no, ?, ?]

[Figure: the hypotheses below S in the hypothesis space]
[high, high, ?, no, yes, ?]   [high, high, ?, no, ?, own]   [high, high, male, no, ?, ?]
[high, high, ?, no, yes, own]   [high, high, male, no, no, ?]
[high, high, female, no, yes, own]   [high, high, male, no, yes, own]   [high, high, male, no, no, rented]
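The steps of Find-S can be sketched for attribute-value hypotheses as follows. This is an illustrative implementation under stated assumptions: hypotheses are lists with "?" for "any value", the most specific hypothesis is represented by `None`, the data uses English translations of the example values, and the function name `find_s` is hypothetical.

```python
# Example data: (income, account, sex, unemployed, car, housing) -> loan
DATA = [
    (("high", "high", "female", "no", "yes", "own"), "yes"),
    (("high", "high", "male", "no", "yes", "own"), "yes"),
    (("low", "low", "male", "no", "yes", "rented"), "no"),
    (("high", "high", "male", "no", "no", "rented"), "yes"),
]


def find_s(examples, positive_label):
    """Return the most specific hypothesis covering all positive examples."""
    h = None  # stands for the most specific hypothesis (covers nothing)
    for x, y in examples:
        if y != positive_label:
            continue  # Find-S ignores negative examples
        if h is None:
            h = list(x)  # the first positive example is taken literally
        else:
            # keep matching values, generalize mismatching ones to "?"
            h = [a if a == v else "?" for a, v in zip(h, x)]
    return h


print(find_s(DATA, "yes"))
```

On the example data the hypothesis is generalized twice: the second positive example relaxes the sex attribute, and the last one relaxes car and housing, which yields the hypothesis S shown above.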

Candidate-Elimination algorithm

1. Initialize G to the set of maximally general hypotheses in H
2. Initialize S to the set of maximally specific hypotheses in H
3. For each example x
   3.1. If x is a positive example, then
        - remove from G any hypothesis inconsistent with x
        - for each hypothesis s in S that is not consistent with x:
          - remove s from S
          - add to S every minimal generalization h of s such that h is consistent with x and some member of G is more general than h
        - remove from S hypotheses that are more general than another hypothesis in S
   3.2. If x is a negative example, then
        - remove from S any hypothesis inconsistent with x
        - for each hypothesis g in G that is not consistent with x:
          - remove g from G
          - add to G every minimal specialization h of g such that h is consistent with x and some member of S is more specific than h
        - remove from G hypotheses that are more specific than another hypothesis in G

Result for the example data (values translated from Czech):

G: [high, ?, ?, ?, ?, ?]   [?, high, ?, ?, ?, ?]

   [high, ?, ?, no, ?, ?]   [high, high, ?, ?, ?, ?]   [?, high, ?, no, ?, ?]

S: [high, high, ?, no, ?, ?]
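The algorithm above can be sketched for conjunctive attribute-value hypotheses as follows. Several choices here are assumptions of this sketch rather than part of the slides: hypotheses are tuples, "?" means any value, `None` stands for ∅, attribute domains are collected from the data, and the value names are English translations of the example.

```python
DATA = [
    (("high", "high", "female", "no", "yes", "own"), "yes"),
    (("high", "high", "male", "no", "yes", "own"), "yes"),
    (("low", "low", "male", "no", "yes", "rented"), "no"),
    (("high", "high", "male", "no", "no", "rented"), "yes"),
]


def covers(h, x):
    return all(a == "?" or a == v for a, v in zip(h, x))


def more_general_or_equal(h1, h2):
    # None (the empty value) lies below every other value
    return all(a == "?" or a == b or b is None for a, b in zip(h1, h2))


def min_generalization(s, x):
    # smallest change to s that makes it cover x
    return tuple(v if a is None else (a if a == v else "?") for a, v in zip(s, x))


def min_specializations(g, x, domains):
    # replace one "?" by a value that differs from x, so that x is excluded
    return [g[:j] + (v,) + g[j + 1:]
            for j, a in enumerate(g) if a == "?"
            for v in domains[j] if v != x[j]]


def candidate_elimination(examples, positive_label):
    n = len(examples[0][0])
    domains = [sorted({x[j] for x, _ in examples}) for j in range(n)]
    G = {("?",) * n}
    S = {(None,) * n}
    for x, y in examples:
        if y == positive_label:
            G = {g for g in G if covers(g, x)}
            new_S = set()
            for s in S:
                if covers(s, x):
                    new_S.add(s)
                else:
                    h = min_generalization(s, x)
                    if any(more_general_or_equal(g, h) for g in G):
                        new_S.add(h)
            # drop members that are more general than another member of S
            S = {s for s in new_S
                 if not any(s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                else:
                    for h in min_specializations(g, x, domains):
                        if any(more_general_or_equal(h, s) for s in S):
                            new_G.add(h)
            # drop members that are more specific than another member of G
            G = {g for g in new_G
                 if not any(g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G


S, G = candidate_elimination(DATA, "yes")
print("S:", sorted(S))
print("G:", sorted(G))
```

For the four example rows this sketch ends with the same boundary sets as listed above: one hypothesis in S and two in G. The hypotheses between the two boundaries form the rest of the version space.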

2. Learning as approximation

Looking only for the parameters of the model.

Example: using a finite number of data points, find the parameters of a (general) function that best fits the data.

\[ y = f(x), \qquad f(x) = q_1 x + q_0 \]

Least squares method: the problem of finding the minimum of the overall error

\[ \min \sum_i (y_i - f(x_i))^2 \]

is transformed into solving the equation

\[ \frac{d}{dq} \sum_i (y_i - f(x_i))^2 = 0 \]
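
Solution:

1) analytic (we know the type of the function): solving the equations for the parameters of the function

\[ q_0 = \frac{\left(\sum_k y_k\right)\left(\sum_k x_k^2\right) - \left(\sum_k x_k y_k\right)\left(\sum_k x_k\right)}{n \sum_k x_k^2 - \left(\sum_k x_k\right)^2} \]

\[ q_1 = \frac{n \sum_k x_k y_k - \left(\sum_k x_k\right)\left(\sum_k y_k\right)}{n \sum_k x_k^2 - \left(\sum_k x_k\right)^2} \]

2) numeric (we do not know the type of the function): gradient methods

\[ \nabla Err(\mathbf{q}) = \left[ \frac{\partial Err}{\partial q_0}, \frac{\partial Err}{\partial q_1}, \dots, \frac{\partial Err}{\partial q_Q} \right] \]

The knowledge $\mathbf{q} = [q_0, q_1, \dots, q_Q]$ is modified according to the rule

\[ q_j \leftarrow q_j + \Delta q_j, \qquad \Delta q_j = -\eta \frac{\partial Err}{\partial q_j} \]

where $\eta$ is a parameter expressing the step used to approach the minimum of the function Err.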
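The analytic solution for the line f(x) = q_1 x + q_0 can be computed directly from the sums in the closed-form expressions. The sketch below is illustrative; the function name `fit_line` and the sample points are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares estimates (q0, q1) for f(x) = q1 * x + q0."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical, the line is not determined")
    q0 = (sum_y * sum_xx - sum_xy * sum_x) / denom
    q1 = (n * sum_xy - sum_x * sum_y) / denom
    return q0, q1


# Points lying exactly on y = 2x + 1
q0, q1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(q0, q1)
```

For these points the sums are n = 4, Σx = 6, Σy = 16, Σx² = 14 and Σxy = 34, so the denominator is 20, q_0 = 20/20 = 1 and q_1 = 40/20 = 2.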
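
E.g. for the error function

\[ Err(f, D_{TR}) = \frac{1}{2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{1}{2} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i))^2 \]

and the expected function $f$ as a linear combination of inputs

\[ f(\mathbf{x}) = \mathbf{q} \cdot \mathbf{x} = \sum_{j=1}^{m} q_j x_j \]

we can derive the gradient of the function Err as

\[
\frac{\partial Err}{\partial q_j}
= \frac{\partial}{\partial q_j} \frac{1}{2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
= \frac{1}{2} \sum_{i=1}^{n} 2 (y_i - \hat{y}_i) \frac{\partial}{\partial q_j} (y_i - \hat{y}_i)
= \sum_{i=1}^{n} (y_i - \hat{y}_i) \frac{\partial}{\partial q_j} (y_i - \mathbf{q} \cdot \mathbf{x}_i)
= \sum_{i=1}^{n} (y_i - \hat{y}_i)(-x_{ij})
\]

So, with the update $\Delta q_j = -\eta \, \partial Err / \partial q_j$,

\[ \Delta q_j = \eta \sum_{i=1}^{n} (y_i - \hat{y}_i) \, x_{ij} \]

Problem with convergence to a local minimum.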
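The derived update rule can be turned into a short batch gradient descent loop. The sketch below is an illustration under assumptions not taken from the slides: the step size `eta`, the number of epochs, the zero initialization and the constant first input (used so that one parameter acts as an intercept) are all chosen for the example.

```python
def gradient_descent(X, y, eta=0.05, epochs=2000):
    """Fit f(x) = sum_j q_j * x_j by repeating q_j <- q_j + eta * sum_i (y_i - y_hat_i) * x_ij."""
    m = len(X[0])
    q = [0.0] * m  # assumed starting point
    for _ in range(epochs):
        # predictions with the current parameters
        y_hat = [sum(qj * xj for qj, xj in zip(q, row)) for row in X]
        # delta_j = eta * sum_i (y_i - y_hat_i) * x_ij
        delta = [eta * sum((yi - yh) * row[j] for yi, yh, row in zip(y, y_hat, X))
                 for j in range(m)]
        q = [qj + dj for qj, dj in zip(q, delta)]
    return q


# Points on y = 2x + 1; the first input is a constant 1 (an assumption of this sketch)
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 5, 7]
q = gradient_descent(X, y)
print([round(v, 4) for v in q])
```

For a linear model the squared error is convex, so this particular example has a single minimum; the local-minimum problem concerns error functions of non-linear models. A step size that is too large makes the iteration diverge, which is why `eta` has to be chosen with care.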