Machine Learning 10-701/15-781, Fall 2008. Hidden Markov Model. Eric Xing. Lecture 17, March 24, 2008

Transcription:

Machine Learning 10-701/15-781, Fall 2008
Hidden Markov Model
Eric Xing
Lecture 17, March 24, 2008
Reading: Chap. 13, C.B book

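Hidden Markov Model: from static to dynamic mixture models

[Figure: a static mixture (hidden variable $Y$, observed variable $X$, replicated $N$ times) next to a dynamic mixture (a chain of hidden variables $Y_1, Y_2, Y_3, \ldots, Y_T$ emitting observations $X_1, X_2, X_3, \ldots, X_T$).]

Hidden Markov Models

The underlying source: genomic entities, dice
The sequence: Plo N, sequence of rolls

[Figure: hidden chain $Y_1, Y_2, Y_3, \ldots, Y_T$ with observations $X_1, X_2, X_3, \ldots, X_T$.]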

Example: The Dishonest Casino

A casino has two dice:
Fair die: P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
Loaded die: P(1) = P(2) = P(3) = P(4) = P(5) = 1/10, P(6) = 1/2
The casino player switches back and forth between the fair and the loaded die once every 20 turns.

Game:
1. You bet $1
2. You roll (always with a fair die)
3. Casino player rolls (maybe with fair die, maybe with loaded die)
4. Highest number wins $2

Puzzles Regarding the Dishonest Casino

GIVEN: A sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344

QUESTION
How likely is this sequence, given our model of how the casino works?
This is the EVALUATION problem in HMMs.
What portion of the sequence was generated with the fair die, and what portion with the loaded die?
This is the DECODING question in HMMs.
How "loaded" is the loaded die? How "fair" is the fair die? How often does the casino player change from fair to loaded, and back?
This is the LEARNING question in HMMs.

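A Stochastic Generative Model

Observed sequence: 1 4 3 6 6 4
Hidden sequence (a parse or segmentation): one state label, A or B, per observed symbol

[Figure: two hidden states, A and B, each emitting die faces.]

Definition of HMM

Observation space
Alphabetic set: $C = \{c_1, c_2, \ldots, c_K\}$
Euclidean space: $\mathbb{R}^d$

Index set of hidden states: $I = \{1, 2, \ldots, M\}$

Transition probabilities between any two states:
$p(y_t^j = 1 \mid y_{t-1}^i = 1) = a_{i,j}$,
or $p(y_t \mid y_{t-1}^i = 1) \sim \mathrm{Multinomial}(a_{i,1}, a_{i,2}, \ldots, a_{i,M}), \ \forall i \in I$

Start probabilities:
$p(y_1) \sim \mathrm{Multinomial}(\pi_1, \pi_2, \ldots, \pi_M)$

Emission probabilities associated with each state:
$p(x_t \mid y_t^i = 1) \sim \mathrm{Multinomial}(b_{i,1}, b_{i,2}, \ldots, b_{i,K}), \ \forall i \in I$,
or in general: $p(x_t \mid y_t^i = 1) \sim f(\cdot \mid \theta_i), \ \forall i \in I$

[Figure: the HMM drawn as a graphical model (hidden chain $y_1, y_2, y_3, \ldots$ over observations $x_1, x_2, x_3, \ldots$) and as a state automaton over states $1, 2, \ldots, K$.]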

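Probability of a Parse

Given a sequence $x = x_1, \ldots, x_T$ and a parse $y = y_1, \ldots, y_T$, find how likely the parse is (given our HMM and the sequence).

Joint probability:
$$p(x, y) = p(x_1, \ldots, x_T, y_1, \ldots, y_T) = p(y_1)\, p(x_1 \mid y_1)\, p(y_2 \mid y_1)\, p(x_2 \mid y_2) \cdots p(y_T \mid y_{T-1})\, p(x_T \mid y_T)$$
$$= p(y_1)\, p(y_2 \mid y_1) \cdots p(y_T \mid y_{T-1}) \times p(x_1 \mid y_1)\, p(x_2 \mid y_2) \cdots p(x_T \mid y_T)$$

Let $\pi_{y_1} \stackrel{\text{def}}{=} \prod_{i=1}^{M} [\pi_i]^{y_1^i}$, $\ a_{y_t, y_{t+1}} \stackrel{\text{def}}{=} \prod_{i,j=1}^{M} [a_{i,j}]^{y_t^i y_{t+1}^j}$, and $b_{y_t, x_t} \stackrel{\text{def}}{=} \prod_{i=1}^{M} \prod_{k=1}^{K} [b_{i,k}]^{y_t^i x_t^k}$. Then
$$p(x, y) = \pi_{y_1}\, a_{y_1, y_2} \cdots a_{y_{T-1}, y_T}\; b_{y_1, x_1} \cdots b_{y_T, x_T}$$

Marginal probability:
$$p(x) = \sum_y p(x, y) = \sum_{y_1} \sum_{y_2} \cdots \sum_{y_T} \pi_{y_1} \prod_{t=2}^{T} a_{y_{t-1}, y_t} \prod_{t=1}^{T} p(x_t \mid y_t)$$

Posterior probability:
$$p(y \mid x) = p(x, y) / p(x)$$

The Dishonest Casino Model

Transitions: FAIR to FAIR 0.95, FAIR to LOADED 0.05, LOADED to LOADED 0.95, LOADED to FAIR 0.05

FAIR: $P(1 \mid F) = P(2 \mid F) = P(3 \mid F) = P(4 \mid F) = P(5 \mid F) = P(6 \mid F) = 1/6$
LOADED: $P(1 \mid L) = P(2 \mid L) = P(3 \mid L) = P(4 \mid L) = P(5 \mid L) = 1/10$, $P(6 \mid L) = 1/2$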

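Example: The Dishonest Casino

Let the sequence of rolls be: $x = 1, 2, 1, 5, 6, 2, 1, 6, 2, 4$

Then, what is the likelihood of $y$ = Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair?
(say initial probs $a_{0,\mathrm{Fair}} = 1/2$, $a_{0,\mathrm{Loaded}} = 1/2$)

$$\tfrac{1}{2} \times P(1 \mid \mathrm{Fair})\, P(\mathrm{Fair} \mid \mathrm{Fair})\, P(2 \mid \mathrm{Fair})\, P(\mathrm{Fair} \mid \mathrm{Fair}) \cdots P(4 \mid \mathrm{Fair}) = \tfrac{1}{2} \times (1/6)^{10} \times (0.95)^{9} = .00000000521158647211 \approx 5.21 \times 10^{-9}$$

Example: The Dishonest Casino

So, the likelihood that the die is fair in all this run is just $5.21 \times 10^{-9}$.

OK, but what is the likelihood of $y$ = Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded?

$$\tfrac{1}{2} \times P(1 \mid \mathrm{Loaded})\, P(\mathrm{Loaded} \mid \mathrm{Loaded}) \cdots P(4 \mid \mathrm{Loaded}) = \tfrac{1}{2} \times (1/10)^{8} \times (1/2)^{2} \times (0.95)^{9} = .00000000078781176215 \approx 0.79 \times 10^{-9}$$

Therefore, it is after all about 6.6 times more likely ($5.21 / 0.79 \approx 6.59$) that the die is fair all the way than that it is loaded all the way.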

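Example: The Dishonest Casino

Let the sequence of rolls be: $x = 1, 6, 6, 5, 6, 2, 6, 6, 3, 6$

Now, what is the likelihood of $y = F, F, \ldots, F$?
$$\tfrac{1}{2} \times (1/6)^{10} \times (0.95)^{9} \approx 5.21 \times 10^{-9}, \text{ same as before}$$

What is the likelihood of $y = L, L, \ldots, L$?
$$\tfrac{1}{2} \times (1/10)^{4} \times (1/2)^{6} \times (0.95)^{9} = .00000049238235134735 \approx 5 \times 10^{-7}$$

So, it is about 100 times more likely that the die is loaded.

Three Main Questions on HMMs

1. Evaluation
GIVEN an HMM $M$, and a sequence $x$,
FIND $\mathrm{Prob}(x \mid M)$
ALGO. Forward

2. Decoding
GIVEN an HMM $M$, and a sequence $x$,
FIND the sequence $y$ of states that maximizes, e.g., $P(y \mid x, M)$, or the most probable subsequence of states
ALGO. Viterbi, Forward-backward

3. Learning
GIVEN an HMM $M$, with unspecified transition/emission probs., and a sequence $x$,
FIND parameters $\theta = (\pi_i, a_{ij}, \eta_{ik})$ that maximize $P(x \mid \theta)$
ALGO. Baum-Welch (EM)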
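The two worked examples above can be checked numerically. The following is a minimal sketch that multiplies out the joint probability of a roll sequence and a parse for the casino model; the names `START`, `TRANS`, `EMIT` and `joint_probability` are illustrative choices, not notation from the lecture.

```python
# Dishonest casino model, with the numbers given on the slides.
START = {"F": 0.5, "L": 0.5}
TRANS = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.05, "L": 0.95}}
EMIT = {
    "F": {face: 1 / 6 for face in range(1, 7)},
    "L": {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5},
}


def joint_probability(rolls, states):
    """Joint probability p(x, y) of a roll sequence x and a parse y."""
    p = START[states[0]] * EMIT[states[0]][rolls[0]]
    for t in range(1, len(rolls)):
        # One transition and one emission per additional position.
        p *= TRANS[states[t - 1]][states[t]] * EMIT[states[t]][rolls[t]]
    return p


rolls_a = [1, 2, 1, 5, 6, 2, 1, 6, 2, 4]
rolls_b = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
all_fair, all_loaded = ["F"] * 10, ["L"] * 10

for name, rolls in (("first", rolls_a), ("second", rolls_b)):
    p_fair = joint_probability(rolls, all_fair)
    p_loaded = joint_probability(rolls, all_loaded)
    print(f"{name}: fair={p_fair:.3g} loaded={p_loaded:.3g}")
```

Only two of the $2^{10}$ possible parses are compared here; summing over all of them is the evaluation problem.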

Applications of HMMs

Some early applications of HMMs
finance, but we never saw them
speech recognition
modelling ion channels

In the mid-late 1980s HMMs entered genetics and molecular biology, and they are now firmly entrenched.

Some current applications of HMMs to biology
mapping chromosomes
aligning biological sequences
predicting sequence structure
inferring evolutionary relationships
finding genes in DNA sequence

Typical structure of a gene

[Figure: typical structure of a gene.]

GENSCAN (Burge & Karlin)

[Figure: GENSCAN state diagram for the forward (+) strand and the reverse (-) strand, with states including 5'UTR, 3'UTR, exon states E0, E1, E2, intron states I0, I1, I2, promoter, poly-A and intergenic region, and parameter sets $\theta_1, \theta_2, \theta_3, \theta_4$, shown over a stretch of DNA sequence.]

The HMM Algorithms

Questions:
Evaluation: What is the probability of the observed sequence? Forward
Decoding: What is the probability that the state of the 3rd roll is loaded, given the observed sequence? Forward-Backward
Decoding: What is the most likely die sequence? Viterbi
Learning: Under what parameterization are the observed sequences most probable? Baum-Welch (EM)

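The Forward Algorithm

We want to calculate $P(x)$, the likelihood of $x$, given the HMM.
Sum over all possible ways of generating $x$:
$$p(x) = \sum_y p(x, y) = \sum_{y_1} \sum_{y_2} \cdots \sum_{y_T} \pi_{y_1} \prod_{t=2}^{T} a_{y_{t-1}, y_t} \prod_{t=1}^{T} p(x_t \mid y_t)$$
To avoid summing over an exponential number of paths $y$, define
$$\alpha(y_t^k = 1) = \alpha_t^k \stackrel{\text{def}}{=} P(x_1, \ldots, x_t, y_t^k = 1) \quad \text{(the forward probability)}$$
The recursion:
$$\alpha_t^k = p(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i a_{i,k}, \qquad P(x) = \sum_k \alpha_T^k$$

The Forward Algorithm: derivation

Compute the forward probability:
$$\alpha_t^k = P(x_1, \ldots, x_{t-1}, x_t, y_t^k = 1)$$
$$= \sum_{y_{t-1}} P(x_1, \ldots, x_{t-1}, y_{t-1})\, P(y_t^k = 1 \mid y_{t-1}, x_1, \ldots, x_{t-1})\, P(x_t \mid y_t^k = 1, x_1, \ldots, x_{t-1}, y_{t-1})$$
$$= \sum_{y_{t-1}} P(x_1, \ldots, x_{t-1}, y_{t-1})\, P(y_t^k = 1 \mid y_{t-1})\, P(x_t \mid y_t^k = 1)$$
$$= P(x_t \mid y_t^k = 1) \sum_i P(x_1, \ldots, x_{t-1}, y_{t-1}^i = 1)\, P(y_t^k = 1 \mid y_{t-1}^i = 1)$$
$$= P(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i a_{i,k}$$
Chain rule: $P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid A, B)$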

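The Forward Algorithm

We can compute $\alpha_t^k$ for all $k$, $t$, using dynamic programming!

Initialization:
$$\alpha_1^k = P(x_1, y_1^k = 1) = P(x_1 \mid y_1^k = 1)\, P(y_1^k = 1) = P(x_1 \mid y_1^k = 1)\, \pi_k$$
Iteration:
$$\alpha_t^k = P(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i a_{i,k}$$
Termination:
$$P(x) = \sum_k \alpha_T^k$$

The Backward Algorithm

We want to compute $P(y_t^k = 1 \mid x)$, the posterior probability distribution on the $t$-th position, given $x$.
We start by computing
$$P(y_t^k = 1, x) = P(x_1, \ldots, x_t, y_t^k = 1, x_{t+1}, \ldots, x_T)$$
$$= P(x_1, \ldots, x_t, y_t^k = 1)\, P(x_{t+1}, \ldots, x_T \mid x_1, \ldots, x_t, y_t^k = 1)$$
$$= P(x_1, \ldots, x_t, y_t^k = 1)\, P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$$
Forward: $\alpha_t^k$. Backward: $\beta_t^k = P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$.

The recursion:
$$\beta_t^k = \sum_i a_{k,i}\, p(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$$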
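The forward recursion above can be sketched in a few lines. This is an illustrative implementation for the casino model, assuming states are coded as 0 (fair) and 1 (loaded) and that `PI`, `A`, `B` hold the start, transition and emission probabilities; a brute-force sum over all paths is included only to check the dynamic program on a short sequence.

```python
from itertools import product

# Casino HMM from the slides: state 0 = fair, state 1 = loaded.
PI = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
B = [[1 / 6] * 6, [0.1] * 5 + [0.5]]  # B[k][face - 1]


def forward(x):
    """Return (alpha, p_x); alpha[t][k] is the forward probability of
    state k at 0-based position t, and p_x is the likelihood P(x)."""
    K = len(PI)
    # Initialization: alpha_1^k = P(x_1 | k) * pi_k
    alpha = [[B[k][x[0] - 1] * PI[k] for k in range(K)]]
    for t in range(1, len(x)):
        prev = alpha[-1]
        # Iteration: alpha_t^k = P(x_t | k) * sum_i alpha_{t-1}^i a_{i,k}
        alpha.append([
            B[k][x[t] - 1] * sum(prev[i] * A[i][k] for i in range(K))
            for k in range(K)
        ])
    # Termination: P(x) = sum_k alpha_T^k
    return alpha, sum(alpha[-1])


def brute_force(x):
    """Sum p(x, y) over every state path y (exponential; for checking only)."""
    total = 0.0
    for y in product(range(len(PI)), repeat=len(x)):
        p = PI[y[0]] * B[y[0]][x[0] - 1]
        for t in range(1, len(x)):
            p *= A[y[t - 1]][y[t]] * B[y[t]][x[t] - 1]
        total += p
    return total


rolls = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
alpha, p_x = forward(rolls)
print(p_x, brute_force(rolls))
```

The dynamic program touches each (position, state, previous state) triple once, whereas the brute-force sum visits all $2^{10}$ paths.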

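The Backward Algorithm: derivation

Define the backward probability:
$$\beta_t^k = P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$$
$$= \sum_{y_{t+1}} P(x_{t+1}, \ldots, x_T, y_{t+1} \mid y_t^k = 1)$$
$$= \sum_i P(y_{t+1}^i = 1 \mid y_t^k = 1)\, p(x_{t+1} \mid y_{t+1}^i = 1, y_t^k = 1)\, P(x_{t+2}, \ldots, x_T \mid x_{t+1}, y_{t+1}^i = 1, y_t^k = 1)$$
$$= \sum_i P(y_{t+1}^i = 1 \mid y_t^k = 1)\, p(x_{t+1} \mid y_{t+1}^i = 1)\, P(x_{t+2}, \ldots, x_T \mid y_{t+1}^i = 1)$$
$$= \sum_i a_{k,i}\, p(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$$
Chain rule: $P(A, B, C \mid \alpha) = P(A \mid \alpha)\, P(B \mid A, \alpha)\, P(C \mid A, B, \alpha)$

The Backward Algorithm

We can compute $\beta_t^k$ for all $k$, $t$, using dynamic programming!

Initialization:
$$\beta_T^k = 1, \ \forall k$$
Iteration:
$$\beta_t^k = \sum_i a_{k,i}\, P(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$$
Termination:
$$P(x) = \sum_k \alpha_1^k \beta_1^k$$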
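As a hedged illustration of how the backward pass combines with the forward pass, the sketch below computes $\beta_t^k$ for the casino model and then the per-position posteriors $P(y_t^k = 1 \mid x) = \alpha_t^k \beta_t^k / P(x)$. The 0/1 state coding and the names `PI`, `A`, `B`, `posteriors` are assumptions made for this example.

```python
# Casino HMM from the slides: state 0 = fair, state 1 = loaded.
PI = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
B = [[1 / 6] * 6, [0.1] * 5 + [0.5]]  # B[k][face - 1]


def forward(x):
    """alpha[t][k] = P(x up to position t, state k at t), 0-based t."""
    K = len(PI)
    alpha = [[B[k][x[0] - 1] * PI[k] for k in range(K)]]
    for t in range(1, len(x)):
        prev = alpha[-1]
        alpha.append([B[k][x[t] - 1] * sum(prev[i] * A[i][k] for i in range(K))
                      for k in range(K)])
    return alpha


def backward(x):
    """beta[t][k] = P(x after position t | state k at t), 0-based t."""
    K, T = len(PI), len(x)
    beta = [[1.0] * K for _ in range(T)]  # initialization: beta_T^k = 1
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[k][i] * B[i][x[t + 1] - 1] * beta[t + 1][i]
                       for i in range(K))
                   for k in range(K)]
    return beta


def posteriors(x):
    """Per-position state posteriors P(y_t = k | x)."""
    alpha, beta = forward(x), backward(x)
    p_x = sum(alpha[-1])
    return [[a * b / p_x for a, b in zip(alpha[t], beta[t])]
            for t in range(len(x))]


rolls = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
for t, (p_fair, p_loaded) in enumerate(posteriors(rolls), start=1):
    print(t, rolls[t - 1], round(p_fair, 3), round(p_loaded, 3))
```

A useful sanity check is that $\sum_k \alpha_t^k \beta_t^k$ gives the same value $P(x)$ at every position $t$, not only at $t = 1$.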

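Posterior decoding

We can now calculate
$$P(y_t^k = 1 \mid x) = \frac{P(y_t^k = 1, x)}{P(x)} = \frac{\alpha_t^k \beta_t^k}{P(x)}$$
Then, we can ask: what is the most likely state at position $t$ of sequence $x$?
$$k_t^* = \arg\max_k P(y_t^k = 1 \mid x)$$
Note that this is a MAP of a single hidden state; what if we want a MAP of a whole hidden state sequence?

Posterior Decoding: $\{ y_t^{k_t^*} = 1 : t = 1, \ldots, T \}$

This is different from the MAP of a whole sequence of hidden states.
This can be understood as bit error rate vs. word error rate.

Example: MAP of X? MAP of (X, Y)?

x  y  P(x, y)
0  0  0.35
0  1  0.05
1  0  0.3
1  1  0.3

Here the MAP of X alone is X = 1 (marginal probability 0.6), while the MAP of (X, Y) is (0, 0) (probability 0.35).

Viterbi decoding

GIVEN $x = x_1, \ldots, x_T$, we want to find $y = y_1, \ldots, y_T$ such that $P(y \mid x)$ is maximized:
$$y^* = \arg\max_y P(y \mid x) = \arg\max_y P(y, x)$$
Let
$$V_t^k = \max_{\{y_1, \ldots, y_{t-1}\}} P(x_1, \ldots, x_{t-1}, y_1, \ldots, y_{t-1}, x_t, y_t^k = 1)$$
= probability of the most likely sequence of states ending at state $y_t = k$.

[Figure: trellis with positions $x_1, x_2, x_3, \ldots, x_N$ along one axis and states $1, 2, \ldots, K$ along the other.]

The recursion:
$$V_t^k = p(x_t \mid y_t^k = 1) \max_i a_{i,k} V_{t-1}^i$$
Underflows are a significant problem:
$$p(x_1, \ldots, x_t, y_1, \ldots, y_t) = \pi_{y_1}\, a_{y_1, y_2} \cdots a_{y_{t-1}, y_t}\; b_{y_1, x_1} \cdots b_{y_t, x_t}$$
These numbers become extremely small: underflow.
Solution: take the logs of all values:
$$V_t^k = \log p(x_t \mid y_t^k = 1) + \max_i \left( \log a_{i,k} + V_{t-1}^i \right)$$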

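The Viterbi Algorithm: derivation

Define the Viterbi probability:
$$V_{t+1}^k = \max_{\{y_1, \ldots, y_t\}} P(x_1, \ldots, x_t, y_1, \ldots, y_t, x_{t+1}, y_{t+1}^k = 1)$$
$$= \max_{\{y_1, \ldots, y_t\}} P(x_{t+1}, y_{t+1}^k = 1 \mid x_1, \ldots, x_t, y_1, \ldots, y_t)\, P(x_1, \ldots, x_t, y_1, \ldots, y_t)$$
$$= \max_{\{y_1, \ldots, y_t\}} P(x_{t+1}, y_{t+1}^k = 1 \mid y_t)\, P(x_1, \ldots, x_{t-1}, y_1, \ldots, y_{t-1}, x_t, y_t)$$
$$= \max_i P(x_{t+1}, y_{t+1}^k = 1 \mid y_t^i = 1) \max_{\{y_1, \ldots, y_{t-1}\}} P(x_1, \ldots, x_{t-1}, y_1, \ldots, y_{t-1}, x_t, y_t^i = 1)$$
$$= \max_i P(x_{t+1} \mid y_{t+1}^k = 1)\, a_{i,k}\, V_t^i$$
$$= P(x_{t+1} \mid y_{t+1}^k = 1) \max_i a_{i,k} V_t^i$$

The Viterbi Algorithm

Input: $x = x_1, \ldots, x_T$

Initialization:
$$V_1^k = P(x_1 \mid y_1^k = 1)\, \pi_k$$
Iteration:
$$V_t^k = P(x_t \mid y_t^k = 1) \max_i a_{i,k} V_{t-1}^i, \qquad \mathrm{Ptr}(k, t) = \arg\max_i a_{i,k} V_{t-1}^i$$
Termination:
$$P(x, y^*) = \max_k V_T^k$$
TraceBack:
$$y_T^* = \arg\max_k V_T^k, \qquad y_{t-1}^* = \mathrm{Ptr}(y_t^*, t)$$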
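The initialization, iteration, termination and traceback steps above can be sketched as follows. This version works in log space, as suggested for avoiding underflow, and uses the casino model with an assumed 0/1 coding of the fair and loaded states; `viterbi` and `log_joint` are illustrative names.

```python
from math import log

# Casino HMM from the slides: state 0 = fair, state 1 = loaded.
PI = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
B = [[1 / 6] * 6, [0.1] * 5 + [0.5]]  # B[k][face - 1]


def viterbi(x):
    """Return (best_path, log P(x, best_path)) using log-space recursion."""
    K = len(PI)
    # Initialization: V_1^k = log P(x_1 | k) + log pi_k
    V = [[log(B[k][x[0] - 1]) + log(PI[k]) for k in range(K)]]
    pointers = []
    for t in range(1, len(x)):
        row, back = [], []
        for k in range(K):
            # Best predecessor i for state k at this position.
            best_i = max(range(K), key=lambda i: V[-1][i] + log(A[i][k]))
            row.append(log(B[k][x[t] - 1]) + V[-1][best_i] + log(A[best_i][k]))
            back.append(best_i)
        V.append(row)
        pointers.append(back)
    # Termination and traceback.
    last = max(range(K), key=lambda k: V[-1][k])
    path = [last]
    for back in reversed(pointers):
        path.append(back[path[-1]])
    path.reverse()
    return path, V[-1][last]


def log_joint(x, y):
    """log p(x, y) for a given path y, used to check the result."""
    total = log(PI[y[0]]) + log(B[y[0]][x[0] - 1])
    for t in range(1, len(x)):
        total += log(A[y[t - 1]][y[t]]) + log(B[y[t]][x[t] - 1])
    return total


rolls = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
path, score = viterbi(rolls)
print(path, score)
```

Ties between predecessors are broken in favor of the lower state index, which is an arbitrary choice of this sketch.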

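Computational Complexity and implementation details

What is the running time, and space required, for Forward and Backward?
$$\alpha_t^k = p(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i a_{i,k}$$
$$\beta_t^k = \sum_i a_{k,i}\, p(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$$
$$V_t^k = p(x_t \mid y_t^k = 1) \max_i a_{i,k} V_{t-1}^i$$
Time: $O(K^2 N)$; Space: $O(KN)$, where $K$ is the number of hidden states and $N$ the sequence length.

Useful implementation techniques to avoid underflows
Viterbi: sum of logs
Forward/Backward: rescaling at each position by multiplying by a constant

Learning HMM: two scenarios

Supervised learning: estimation when the "right answer" is known
Examples:
GIVEN: a genomic region $x = x_1 \ldots x_{1,000,000}$ where we have good (experimental) annotations of the CpG islands
GIVEN: the casino player allows us to observe him one evening, as he changes dice and produces 10,000 rolls

Unsupervised learning: estimation when the "right answer" is unknown
Examples:
GIVEN: the porcupine genome; we don't know how frequent the CpG islands are there, neither do we know their composition
GIVEN: 10,000 rolls of the casino player, but we don't see when he changes dice

QUESTION: Update the parameters $\theta$ of the model to maximize $P(x \mid \theta)$: maximum likelihood (ML) estimation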
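To make the rescaling remark concrete, here is a small sketch of a forward pass that normalizes $\alpha$ at every position and accumulates the logs of the scaling constants, so that $\log P(x)$ stays representable on long sequences. Normalizing by the sum of the current $\alpha$ values is one common choice of constant and is an assumption of this example, as are the names used.

```python
from math import log

# Casino HMM from the slides: state 0 = fair, state 1 = loaded.
PI = [0.5, 0.5]
A = [[0.95, 0.05], [0.05, 0.95]]
B = [[1 / 6] * 6, [0.1] * 5 + [0.5]]  # B[k][face - 1]


def likelihood_unscaled(x):
    """Plain forward pass; underflows to 0.0 on long sequences."""
    K = len(PI)
    alpha = [PI[k] * B[k][x[0] - 1] for k in range(K)]
    for t in range(1, len(x)):
        alpha = [B[k][x[t] - 1] * sum(alpha[i] * A[i][k] for i in range(K))
                 for k in range(K)]
    return sum(alpha)


def log_likelihood_scaled(x):
    """Forward pass with per-position rescaling; returns log P(x)."""
    K = len(PI)
    alpha = [PI[k] * B[k][x[0] - 1] for k in range(K)]
    total = 0.0
    for t in range(len(x)):
        if t > 0:
            alpha = [B[k][x[t] - 1] * sum(alpha[i] * A[i][k] for i in range(K))
                     for k in range(K)]
        scale = sum(alpha)                    # constant for this position
        alpha = [v / scale for v in alpha]    # rescaled alphas sum to 1
        total += log(scale)                   # log P(x) = sum of log constants
    return total


short = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
long_run = short * 300  # 3000 rolls
print(log(likelihood_unscaled(short)), log_likelihood_scaled(short))
print(likelihood_unscaled(long_run), log_likelihood_scaled(long_run))
```

On the short sequence both routes agree; on the long one the unscaled likelihood is too small for a double-precision float, while the scaled version still returns a finite log-likelihood.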

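Supervised ML estimation

Given $x = x_1 \ldots x_N$ for which the true state path $y = y_1 \ldots y_N$ is known,

Define:
$A_{ij}$ = # times state transition $i \to j$ occurs in $y$
$B_{ik}$ = # times state $i$ in $y$ emits $k$ in $x$

We can show that the maximum likelihood parameters $\theta$ are:
$$a_{ij}^{ML} = \frac{\#(i \to j)}{\#(i \to \bullet)} = \frac{\sum_{t=2}^{N} y_{t-1}^i y_t^j}{\sum_{t=2}^{N} y_{t-1}^i} = \frac{A_{ij}}{\sum_{j'} A_{ij'}}$$
$$b_{ik}^{ML} = \frac{\#(i \to k)}{\#(i \to \bullet)} = \frac{\sum_{t=1}^{N} y_t^i x_t^k}{\sum_{t=1}^{N} y_t^i} = \frac{B_{ik}}{\sum_{k'} B_{ik'}}$$
(Homework!)

What if $x$ is continuous? We can treat $\{(x_t, y_t) : t = 1{:}N\}$ as $N$ observations of, e.g., a Gaussian, and apply learning rules for Gaussians. (Homework!)

Supervised ML estimation, ctd.

Intuition: When we know the underlying states, the best estimate of $\theta$ is the average frequency of transitions and emissions that occur in the training data.

Drawback: Given little data, there may be overfitting: $P(x \mid \theta)$ is maximized, but $\theta$ is unreasonable. 0 probabilities: VERY BAD.

Example: Given 10 casino rolls, we observe
x = 2, 1, 5, 6, 1, 2, 3, 6, 2, 3
y = F, F, F, F, F, F, F, F, F, F
Then: $a_{FF} = 1$; $a_{FL} = 0$
$b_{F1} = b_{F3} = b_{F6} = .2$; $b_{F2} = .3$; $b_{F4} = 0$; $b_{F5} = .1$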

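Pseudocounts

Solution for small training sets: add pseudocounts
$A_{ij}$ = # times state transition $i \to j$ occurs in $y$ + $R_{ij}$
$B_{ik}$ = # times state $i$ in $y$ emits $k$ in $x$ + $S_{ik}$

$R_{ij}$, $S_{ik}$ are pseudocounts representing our prior belief.
Total pseudocounts: $R_i = \sum_j R_{ij}$, $S_i = \sum_k S_{ik}$
These give the "strength" of the prior belief, i.e. the total number of imaginary instances in the prior.

Larger total pseudocounts: strong prior belief
Small total pseudocounts: just to avoid 0 probabilities (smoothing)

Unsupervised ML estimation

Given $x = x_1 \ldots x_N$ for which the true state path $y = y_1 \ldots y_N$ is unknown,

EXPECTATION MAXIMIZATION
0. Start with our best guess of a model $M$, parameters $\theta$.
1. Estimate $A_{ij}$, $B_{ik}$ in the training data. How?
$A_{ij} = \sum_t \langle y_{t-1}^i y_t^j \rangle$, $\ B_{ik} = \sum_t \langle y_t^i \rangle x_t^k$. How? (homework)
2. Update $\theta$ according to $A_{ij}$, $B_{ik}$. This is now a "supervised learning" problem.
3. Repeat 1 and 2, until convergence.

This is called the Baum-Welch Algorithm.
We can get to a provably more (or equally) likely parameter set $\theta$ each iteration.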
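The counting estimates and the effect of pseudocounts can be illustrated on the 10-roll example. The sketch below assumes a single uniform pseudocount added to every transition and emission count (so $R_{ij} = S_{ik}$ = `pseudo`); that simplification, and the function name `estimate`, are choices made for this example only.

```python
from collections import Counter

STATES = ["F", "L"]
FACES = [1, 2, 3, 4, 5, 6]


def estimate(rolls, states, pseudo=0.0):
    """Relative-frequency estimates of a_ij and b_ik from a labeled sequence.

    `pseudo` is added to every transition count and every emission count.
    """
    trans = Counter(zip(states, states[1:]))  # A_ij before pseudocounts
    emit = Counter(zip(states, rolls))        # B_ik before pseudocounts
    a, b = {}, {}
    for i in STATES:
        row_a = {j: trans[(i, j)] + pseudo for j in STATES}
        row_b = {k: emit[(i, k)] + pseudo for k in FACES}
        # A state with no counts at all has no defined estimate; skip it.
        if sum(row_a.values()) > 0:
            a[i] = {j: v / sum(row_a.values()) for j, v in row_a.items()}
        if sum(row_b.values()) > 0:
            b[i] = {k: v / sum(row_b.values()) for k, v in row_b.items()}
    return a, b


rolls = [2, 1, 5, 6, 1, 2, 3, 6, 2, 3]
states = ["F"] * 10

a_ml, b_ml = estimate(rolls, states)              # plain ML: zeros appear
a_sm, b_sm = estimate(rolls, states, pseudo=1.0)  # smoothed with pseudocounts

print(a_ml["F"], b_ml["F"])
print(a_sm["F"], b_sm["F"])
```

Without pseudocounts the loaded state is never observed, so it gets no estimate at all, and face 4 gets probability 0 under the fair die; with a pseudocount of 1 every entry becomes strictly positive.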

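The Baum-Welch algorithm

For $N$ training sequences $x_n = x_{n,1}, \ldots, x_{n,T}$ with hidden state paths $y_n = y_{n,1}, \ldots, y_{n,T}$:

The complete log likelihood
$$\ell_c(\theta; x, y) = \log p(x, y) = \log \prod_n \Big( p(y_{n,1}) \prod_{t=2}^{T} p(y_{n,t} \mid y_{n,t-1}) \prod_{t=1}^{T} p(x_{n,t} \mid y_{n,t}) \Big)$$
The expected complete log likelihood
$$\langle \ell_c(\theta; x, y) \rangle = \sum_n \sum_i \langle y_{n,1}^i \rangle \log \pi_i + \sum_n \sum_{t=2}^{T} \sum_{i,j} \langle y_{n,t-1}^i y_{n,t}^j \rangle \log a_{i,j} + \sum_n \sum_{t=1}^{T} \sum_{i,k} x_{n,t}^k \langle y_{n,t}^i \rangle \log b_{i,k}$$
EM
The E step
$$\gamma_{n,t}^i = \langle y_{n,t}^i \rangle = p(y_{n,t}^i = 1 \mid x_n)$$
$$\xi_{n,t}^{i,j} = \langle y_{n,t-1}^i y_{n,t}^j \rangle = p(y_{n,t-1}^i = 1, y_{n,t}^j = 1 \mid x_n)$$
The M step ("symbolically" identical to MLE)
$$\pi_i^{ML} = \frac{\sum_n \gamma_{n,1}^i}{N}, \qquad a_{ij}^{ML} = \frac{\sum_n \sum_{t=2}^{T} \xi_{n,t}^{i,j}}{\sum_n \sum_{t=1}^{T-1} \gamma_{n,t}^i}, \qquad b_{ik}^{ML} = \frac{\sum_n \sum_{t=1}^{T} \gamma_{n,t}^i x_{n,t}^k}{\sum_n \sum_{t=1}^{T} \gamma_{n,t}^i}$$

The Baum-Welch algorithm: comments

Time complexity: # iterations $\times\ O(K^2 N)$
Guaranteed to increase the log likelihood of the model
Not guaranteed to find globally best parameters
Converges to a local optimum, depending on initial conditions
Too many parameters / too large model: over-fitting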
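A single Baum-Welch iteration can be sketched as an E step (forward and backward passes giving $\gamma$ and summed $\xi$) followed by an M step (normalized expected counts). The sketch below handles one training sequence without rescaling, so it is only suitable for short inputs; the starting parameters and the 20-roll sequence (the two example roll sequences concatenated) are arbitrary choices for illustration.

```python
from math import log


def forward(x, pi, a, b):
    K = len(pi)
    alpha = [[pi[k] * b[k][x[0]] for k in range(K)]]
    for t in range(1, len(x)):
        alpha.append([b[k][x[t]] * sum(alpha[-1][i] * a[i][k] for i in range(K))
                      for k in range(K)])
    return alpha


def backward(x, a, b):
    K, T = len(a), len(x)
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(a[k][i] * b[i][x[t + 1]] * beta[t + 1][i] for i in range(K))
                   for k in range(K)]
    return beta


def baum_welch_step(x, pi, a, b):
    """One EM iteration on a single sequence x of 0-based symbol indices.

    Returns updated (pi, a, b) and log P(x) under the input parameters.
    """
    K, T, S = len(pi), len(x), len(b[0])
    alpha, beta = forward(x, pi, a, b), backward(x, a, b)
    p_x = sum(alpha[-1])
    # E step: gamma[t][i] = P(y_t = i | x); xi[i][j] = expected # of i -> j.
    gamma = [[alpha[t][i] * beta[t][i] / p_x for i in range(K)] for t in range(T)]
    xi = [[sum(alpha[t - 1][i] * a[i][j] * b[j][x[t]] * beta[t][j]
               for t in range(1, T)) / p_x
           for j in range(K)] for i in range(K)]
    # M step: normalized expected counts.
    new_pi = gamma[0]
    new_a = [[xi[i][j] / sum(gamma[t][i] for t in range(T - 1)) for j in range(K)]
             for i in range(K)]
    new_b = [[sum(gamma[t][i] for t in range(T) if x[t] == s)
              / sum(gamma[t][i] for t in range(T)) for s in range(S)]
             for i in range(K)]
    return new_pi, new_a, new_b, log(p_x)


rolls = [1, 2, 1, 5, 6, 2, 1, 6, 2, 4, 1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
x = [r - 1 for r in rolls]            # symbol indices 0..5
pi = [0.5, 0.5]                       # arbitrary starting guess
a = [[0.9, 0.1], [0.1, 0.9]]
b = [[1 / 6] * 6, [0.1] * 5 + [0.5]]

log_likelihoods = []
for _ in range(10):
    pi, a, b, ll = baum_welch_step(x, pi, a, b)
    log_likelihoods.append(ll)
print(log_likelihoods)
```

The recorded log-likelihoods should never decrease from one iteration to the next, which matches the guarantee stated in the comments slide; with only 20 rolls the fitted parameters are expected to over-fit.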