Course 395: Machine Learning - Lectures
- Lecture 1-2: Concept Learning (M. Pantic)
- Lecture 3-4: Decision Trees & CBC Intro (M. Pantic)
- Lecture 5-6: Artificial Neural Networks (S. Zafeiriou)
- Lecture 7-8: Instance Based Learning (M. Pantic)
- Lecture 9-10: Genetic Algorithms (M. Pantic)
- Lecture 11-12: Evaluating Hypotheses (THs)
- Lecture 13-14: Bayesian Learning - ML Estimation (S. Zafeiriou)
- Lecture 15-16: Expectation Maximization (S. Zafeiriou)
- Lecture 17-18: Inductive Logic Programming (S. Muggleton)
Bayesian Learning - Expectation Maximization
Reading: Slides
ML estimation
Consider a data set $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ and a model $f$. The maximum likelihood estimate is given by:
$f^* = \arg\max_f \, p(D \mid f)$
ML estimation
Assuming that the samples are conditionally independent given $f$:
$f^* = \arg\max_f \prod_{i=1}^{n} p(y_i \mid f)$
Further assuming $y_i = f(x_i) + e_i$ with Gaussian noise, i.e. $y_i \mid f \sim N(f(x_i), \sigma^2)$:
$f^* = \arg\max_f \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2\sigma^2}(y_i - f(x_i))^2}$
ML estimation
ML estimation
Maximizing its logarithm instead, we get
$f^* = \arg\max_f \sum_{i=1}^{n} \left[ \ln\frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}\,(y_i - f(x_i))^2 \right]$
Removing the constant terms, we get
$f^* = \arg\min_f \sum_{i=1}^{n} (f(x_i) - y_i)^2$
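The equivalence just derived (Gaussian ML estimation = least squares) can be checked numerically. A minimal sketch, assuming a made-up linear model $f(x) = a x$ with known noise level $\sigma$ (the data, true slope $a = 2$, and grid search are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: y = 2*x + Gaussian noise with sigma = 0.1
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + rng.normal(0.0, 0.1, size=x.shape)

sigma = 0.1
a_grid = np.linspace(0.0, 4.0, 401)  # candidate slopes

# Gaussian log-likelihood of each candidate slope a
loglik = np.array([
    np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
           - (y - a * x) ** 2 / (2 * sigma**2))
    for a in a_grid
])
# Sum of squared errors for the same candidates
sse = np.array([np.sum((a * x - y) ** 2) for a in a_grid])

# The maximizer of the log-likelihood is exactly the minimizer of the SSE,
# since loglik = const - sse / (2 sigma^2)
assert a_grid[np.argmax(loglik)] == a_grid[np.argmin(sse)]
```

The assertion holds for any data set, because the log-likelihood differs from the negative squared error only by a constant offset and a positive scale.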
ML: simple example
Consider a coin-flipping experiment with a pair of coins A and B of unknown biases $\theta_A$ and $\theta_B$: coin A lands on tails with probability $1 - \theta_A$ (heads with probability $\theta_A$), and similarly for coin B. We want to estimate $\theta = (\theta_A, \theta_B)$.
ML: simple example
Randomly choose one of the two coins (with equal probability), then perform 10 independent tosses. Repeat five times (50 coin tosses in total).
ML: simple example
$x_i$ = number of heads observed during the $i$-th set of tosses; $x = (x_1, x_2, \ldots, x_5)$, $x_i \in \{0, 1, \ldots, 10\}$
$z_i$ = identity of the coin used in the $i$-th set; $z = (z_1, z_2, \ldots, z_5)$, $z_i \in \{A, B\}$
ML: simple example
$H_A$: number of heads using coin A; $F_A$: total number of flips using coin A (similarly $H_B$, $F_B$). Then
$\hat\theta_A = \frac{H_A}{F_A}, \quad \hat\theta_B = \frac{H_B}{F_B}$
Maximum likelihood estimation maximizes $\log P(x, z; \theta)$.
ML: simple example
$P(D \mid \theta) = \theta_A^{H_A}(1-\theta_A)^{T_A}\,\theta_B^{H_B}(1-\theta_B)^{T_B}$
$\log P(D \mid \theta) = H_A \log\theta_A + T_A \log(1-\theta_A) + H_B \log\theta_B + T_B \log(1-\theta_B)$
$\frac{\partial \log P(D \mid \theta)}{\partial \theta_A} = 0 \;\Rightarrow\; \theta_A = \frac{H_A}{H_A + T_A}$ (and similarly $\theta_B = \frac{H_B}{H_B + T_B}$)
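The closed-form complete-data estimate $\theta = H/(H+T)$ can be computed directly once the coin identities are known. A minimal sketch; the per-set head counts below are hypothetical, since the slides' worked figure is not in the text:

```python
# Complete-data ML for the two-coin example: theta_hat = heads / total flips.
# (coin identity, heads out of 10 tosses) per set -- illustrative values
sets = [("B", 5), ("A", 9), ("A", 8), ("B", 4), ("A", 7)]

heads = {"A": 0, "B": 0}
flips = {"A": 0, "B": 0}
for coin, h in sets:
    heads[coin] += h
    flips[coin] += 10  # 10 tosses per set

theta = {c: heads[c] / flips[c] for c in ("A", "B")}
print(theta)  # theta_A = 24/30 = 0.8, theta_B = 9/20 = 0.45
```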
ML: simple example
Expectation Maximization
Consider a more challenging setup: we are given $x = (x_1, x_2, \ldots, x_5)$ but not $z = (z_1, \ldots, z_5)$, the latent (or hidden) variables. Computing the proportion of heads for each coin is no longer possible.
Expectation Maximization
Start with some initial parameters $\hat\theta^{(t)} = (\hat\theta_A^{(t)}, \hat\theta_B^{(t)})$ and determine, for each of the five sets, whether coin A or coin B was more likely to have generated the observed flips. Assuming this data completion is correct, apply regular maximum likelihood to get $\hat\theta^{(t+1)} = (\hat\theta_A^{(t+1)}, \hat\theta_B^{(t+1)})$. Repeat until convergence.
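This "hard-completion" scheme (assign each set to its single most likely coin, then run ordinary ML) can be sketched as follows. The per-set head counts and the starting values $\hat\theta_A = 0.6$, $\hat\theta_B = 0.5$ are illustrative assumptions:

```python
import math

def set_loglik(h, t, theta):
    # log P(h heads, t tails | coin bias theta), binomial coefficient dropped
    return h * math.log(theta) + t * math.log(1 - theta)

x = [5, 9, 8, 4, 7]          # hypothetical heads per set of 10 tosses
theta_A, theta_B = 0.6, 0.5  # initial guesses

for _ in range(20):
    # "E-step": assign each set to the coin most likely to have produced it
    assign = ["A" if set_loglik(h, 10 - h, theta_A) >= set_loglik(h, 10 - h, theta_B)
              else "B" for h in x]
    # "M-step": ordinary complete-data ML on the completed data
    hA = sum(h for h, a in zip(x, assign) if a == "A")
    fA = 10 * assign.count("A")
    hB = sum(h for h, a in zip(x, assign) if a == "B")
    fB = 10 * assign.count("B")
    theta_A = hA / fA if fA else theta_A
    theta_B = hB / fB if fB else theta_B

print(theta_A, theta_B)
```

With these numbers the assignment stabilizes after one pass (high-head sets go to A, the others to B), so the loop converges quickly.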
Expectation Maximization
Instead, compute the probabilities of all possible completions given $\hat\theta^{(t)}$ (not just the most probable one). These probabilities are used to create a weighted training set consisting of all completions. A modified ML procedure that deals with weighted training data is then applied in order to get the new estimate $\hat\theta^{(t+1)}$.
Expectation Maximization
By using weighted training examples rather than choosing the single best completion, the EM algorithm accounts for the confidence of the model in each completion of the data.
Expectation Maximization
$P(A \mid D; \theta) = \frac{P(A)\, P(D \mid A; \theta)}{P(D)}$
For a set with 5 heads and 5 tails, with $\hat\theta_A = 0.6$:
$P(D \mid A; \theta) = 0.6^{H_A} \cdot 0.4^{T_A} = 0.6^5 \cdot 0.4^5 = 0.0007962624, \quad P(A) = 0.5$
Expectation Maximization
$P(B \mid D; \theta) = \frac{P(B)\, P(D \mid B; \theta)}{P(D)}$
For the same set, with $\hat\theta_B = 0.5$:
$P(D \mid B; \theta) = 0.5^{H_B} \cdot 0.5^{T_B} = 0.5^{10} = 0.0009765625, \quad P(B) = 0.5$
Expectation Maximization
$w_A = \frac{P(D \mid A; \theta)}{P(D \mid A; \theta) + P(D \mid B; \theta)} \approx 0.45, \quad w_B = \frac{P(D \mid B; \theta)}{P(D \mid A; \theta) + P(D \mid B; \theta)} \approx 0.55$
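The weights above can be reproduced directly from the likelihoods on the previous slides (equal priors $P(A) = P(B) = 0.5$ cancel in the ratio):

```python
# Posterior responsibility of each coin for one set of 10 tosses (5 heads, 5 tails),
# with current estimates theta_A = 0.6, theta_B = 0.5 and equal priors.
pA = 0.5 * (0.6 ** 5) * (0.4 ** 5)   # P(A) * P(D | A; theta) = 0.5 * 0.0007962624
pB = 0.5 * (0.5 ** 5) * (0.5 ** 5)   # P(B) * P(D | B; theta) = 0.5 * 0.0009765625

wA = pA / (pA + pB)
wB = pB / (pA + pB)
print(round(wA, 2), round(wB, 2))  # 0.45 0.55
```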
Expectation Maximization
EM Algorithm:
E-step: Guess a probability distribution over completions of the missing data, given the current model.
M-step: Re-estimate the model parameters given these completions.
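Both steps can be put together into a full (soft) EM loop for the two-coin model. A sketch under the same assumptions as before: the head counts per set and the starting point $(0.6, 0.5)$ are illustrative, and the M-step reuses the complete-data formula with expected counts:

```python
def em_two_coins(x, n, theta_A, theta_B, iters=50):
    """Soft EM for the two-coin model: x[i] heads out of n tosses per set,
    coin identities hidden, equal prior probability for the two coins."""
    for _ in range(iters):
        # E-step: posterior weight of coin A for each set of tosses
        hA = tA = hB = tB = 0.0
        for h in x:
            lA = theta_A ** h * (1 - theta_A) ** (n - h)
            lB = theta_B ** h * (1 - theta_B) ** (n - h)
            wA = lA / (lA + lB)  # equal priors cancel
            # accumulate expected heads/tails (the weighted training set)
            hA += wA * h
            tA += wA * (n - h)
            hB += (1 - wA) * h
            tB += (1 - wA) * (n - h)
        # M-step: weighted ML, same closed form as the complete-data case
        theta_A = hA / (hA + tA)
        theta_B = hB / (hB + tB)
    return theta_A, theta_B

# Hypothetical data: heads per set of 10 tosses
print(em_two_coins([5, 9, 8, 4, 7], 10, 0.6, 0.5))
```

Unlike the hard-completion variant, every set contributes fractionally to both coins' counts, weighted by the model's confidence in each completion.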
Expectation Maximization
EM: Mathematics
Starting from initial parameters $\hat\theta^{(0)}$, the E-step constructs a function $g_t(\theta)$ that lower-bounds the log-likelihood $\log P(x; \theta)$. In the M-step, $\hat\theta^{(t+1)}$ is computed as the maximizer of $g_t(\theta)$. In the next E-step, a new lower bound $g_{t+1}(\theta)$ is constructed; its maximization gives $\hat\theta^{(t+2)}$, and so on.
EM: Mathematics
The EM algorithm derives from the fact that, for every probability distribution $Q(z)$ over the completions,
$\log P(x; \theta) \;\ge\; \sum_z Q(z) \log \frac{P(x, z; \theta)}{Q(z)} \quad (1)$
where the inequality is tight when $Q(z) = P(z \mid x; \theta)$. This follows from Jensen's inequality: for any concave function $f$ (e.g., $\log$), $f\!\left(\sum_i w_i y_i\right) \ge \sum_i w_i f(y_i)$ for weights $w_i \ge 0$, $\sum_i w_i = 1$. The bound (1) holds by letting $f = \log$, $w_z = Q(z)$, $y_z = P(x, z; \theta)/Q(z)$, since $\log P(x; \theta) = \log \sum_z Q(z)\, \frac{P(x, z; \theta)}{Q(z)}$.
EM: Mathematics
Now consider the update rule
$\hat\theta^{(t+1)} = \arg\max_\theta g_t(\theta)$, where $g_t(\theta) = \sum_z P(z \mid x; \hat\theta^{(t)}) \log \frac{P(x, z; \theta)}{P(z \mid x; \hat\theta^{(t)})}$.
Applying the tightness condition of (1) gives $g_t(\hat\theta^{(t)}) = \log P(x; \hat\theta^{(t)})$. Moreover $g_t(\hat\theta^{(t+1)}) \ge g_t(\hat\theta^{(t)})$, and since $g_t$ is a lower bound of $\log P(x; \theta)$,
$\log P(x; \hat\theta^{(t+1)}) \;\ge\; g_t(\hat\theta^{(t+1)}) \;\ge\; g_t(\hat\theta^{(t)}) \;=\; \log P(x; \hat\theta^{(t)})$
so the update guarantees a monotonic improvement of the likelihood for incomplete data.
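The monotonicity guarantee can be verified numerically for the two-coin model: the incomplete-data log-likelihood never decreases across EM updates. A sketch, with the same illustrative head counts and starting point as earlier:

```python
import math

def incomplete_loglik(x, n, tA, tB):
    # log P(x; theta) = sum_i log( 0.5 P(x_i | A) + 0.5 P(x_i | B) )
    total = 0.0
    for h in x:
        pA = tA ** h * (1 - tA) ** (n - h)
        pB = tB ** h * (1 - tB) ** (n - h)
        total += math.log(0.5 * pA + 0.5 * pB)
    return total

x, n = [5, 9, 8, 4, 7], 10   # hypothetical heads per set of 10 tosses
tA, tB = 0.6, 0.5
ll = [incomplete_loglik(x, n, tA, tB)]
for _ in range(15):
    # one EM update: E-step responsibilities, M-step weighted ML
    hA = fA = hB = fB = 0.0
    for h in x:
        pA = tA ** h * (1 - tA) ** (n - h)
        pB = tB ** h * (1 - tB) ** (n - h)
        w = pA / (pA + pB)
        hA += w * h
        fA += w * n
        hB += (1 - w) * h
        fB += (1 - w) * n
    tA, tB = hA / fA, hB / fB
    ll.append(incomplete_loglik(x, n, tA, tB))

# Each EM update never decreases the incomplete-data log-likelihood
assert all(b >= a - 1e-12 for a, b in zip(ll, ll[1:]))
```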