Graphical Models 101. Biostatistics 615/815 Lecture 12: Hidden Markov Models. Example probability distribution. An example graphical model

Size: px

Start display at page:

Download "Graphical Models 101. Biostatistics 615/815 Lecture 12: Hidden Markov Models. Example probability distribution. An example graphical model"

Avis Wheeler
5 years ago
Views:

1 101 Biostatistics 615/815 Lecture 12: Hidden Markov Models Hyun Min Kang February 15th, 2011 Marriage between probability theory and graph theory Each random variable is represented as vertex Dependency between random variables is modeled as edge Directed edge : conditional distribution Undirected edge : joint distribution Unconnected pair of vertices (without path from one to another) is independent A powerful tool to represent complex structure of dependence / independence between random variables Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 An example graphical model probability distribution!% 658(49$32":% % #$% &'()% *% ;(0<-4% >3<5$32% *+,,-% &/(+0-% 1% /<44% 6?3,0<,:3% 12343,5% &6743,5% 12@!A% 12@*B!A% 12@1B*A% Are H and P independent? Are H and P independent given S? Pr(H) Pr(S H) Value (H) Description (H) Pr(H) 0 Low 03 1 High 07 S Description (S) H Description (H) Pr(S H) 0 Cloudy 0 Low 07 1 Sunny 0 Low 03 0 Cloudy 1 High 01 1 Sunny 1 High 09 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

2 Probability distribution (cont d) Pr(P S) P Description (P) S Description (S) Pr(P S) 0 Absent 0 Cloudy 05 1 Present 0 Cloudy 05 0 Absent 1 Sunny 01 1 Present 1 Sunny 09 Full joint distribution Pr(H, S, P) H S P Pr(H, S, P) With a full join distribution, any type of inference is possible As the number of variables grows, the size of full distribution table increases exponentially Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Pr(H, P S) Pr(H S) Pr(P S) Pr(H, P S) Pr(H S), Pr(P S) H P S Pr(H, P S) H S Pr(H S) P S Pr(P S) Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 H and P are conditionally independent given S!% 658(49$32":% % #$% &'()% *% ;(0<-4% >3<5$32% *+,,-% &/(+0-% 1% /<44% 6?3,0<,:3% 12343,5% &6743,5% 12@!A% 12@*B!A% 12@1B*A% H and P do not have direct path one from another All path from H to P is connected thru S Conditioning on S separates H and P Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

3 Conditional independence in graphical models Markov Blanket '()!+" '()#*!+" #" '()$*#+" '()&*#+" '()%*#+" $" %" &" Pr(A, C, D, E B) Pr(A B) Pr(C B) Pr(D B) Pr(E B) Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 If conditioned on the variables in the gray area (variables with direct dependency), A is independent of all the other nodes A (U A π A ) π A Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hidden Markov Models - An An alternative representation of 346$ 345$ #!$ %&'$ 347$ 348$ ()*# +,-,*+# " # $ # % # & # -! $"# - "#! %$# - $#! &/&0"1# %#! &# 3466$ 3493$ ()**+$,-)/+$ 3493$ 3483$ -,-# 2 # 3 /' " 1 # 3!$ /' $ 1 # 3!% /' % 1 # 3!& /' & 1 # ' "# ' $# ' %# ' &# 3435$ 012*+$ 34:3$ Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

4 Marginal likelihood of data in Forward and backward probabilities Let λ (A, B, π) For a sequence of observation o {o 1,, o t }, Pr(o λ) Pr(o q, λ) Pr(q λ) q Pr(o q, λ) t t Pr(o i q i, λ) b qi (o i ) Pr(q λ) π q1 t Pr(o λ) q i1 i1 i2 a qi q i 1 π q1 b q1 (o q1 ) t a qi q i 1 b qi (o qi ) i2 q t (q 1,, q t 1 ), q + t (q t+1,, q T ) o t (o 1,, o t 1 ), o + t (o t+1,, o T ) Pr(q t i o, λ) Pr(q t i, o λ) Pr(q t i, o λ) Pr(o λ) n j1 Pr(q t j, o λ) Pr(q t, o λ) Pr(q t, o t, o t, o + t λ) Pr(o + t q t, λ) Pr(o t q t, λ) Pr(o t q t, λ) Pr(q t λ) Pr(o + t q t, λ) Pr(o t, o t, q t λ) β t (q t )α t (q t ) If α t (q t ) and β t (q t ) is known, Pr(q t o, λ) can be computed in a linear time Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 DP algorithm for calculating forward probability Conditional dependency in forward-backward algorithms Key idea is to use (q t, o t ) o t q t 1 Forward : (q t, o t ) o t q t 1 Backward : o t+1 o + t+1 q t+1 α t (i) Pr(o 1,, o t, q t i λ) Pr(o t, o t, q t 1 j, q t i λ) j1 Pr(o t, q t 1 j λ) Pr(q t i q t 1 j, λ) Pr(o t q t i, λ) j1 α t 1 (j)a ij b i (o t ) j1 α 1 (i) π i b i (o 1 ) "#$ % " % "&$ %! "#$%! "%! "&$% ' "#$% ' "% ' "&$% Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

5 DP algorithm for calculating backward probability Putting forward and backward probabilities together Key idea is to use o t+1 o + t+1 q t+1 β t (i) Pr(o t+1,, o T q t i, λ) Pr(o t+1, o + t+1, q t+1 j q t i, λ) β T (i) 1 j1 Pr(o t+1 q t+1, λ) Pr(o + t+1 q t+1 j, λ) Pr(q t+1 j q t i, λ) j1 β t+1 (j)a ji b j (o t+1 ) j1 Conditional probability of states given data Pr(q t i o, λ) Time complexity is Θ(n 2 T) Pr(o, q t S i λ) n j1 Pr(o, q t S j λ) α t (i)β t (i) n j1 α t(j)β t (j) Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Finding the most likely trajectory of hidden states The algorithm Given a series of observations, we want to compute Define δ t (i) as arg max Pr(q o, λ) q δ t (i) max Pr(q, o λ) q Use dynamic programming algorithm to find the most likely path Initialization δ 1 (i) πb i (o 1 ) for 1 i n Maintenance δ t (i) max j δ t 1 (j)a ij b i (o t ) ϕ t (i) arg max j δ t 1 (j)a ij Termination Max likelihood is max i δ T (i) Optimal path can be backtracked using ϕ t (i) Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

An example An example path When observations were (walk, shop, clean) Similar to Dijkstra s or Manhattan tourist algorithm Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, 2011 21 / 27

2(Biased)} Initial states : π {09, 01} ( ) 095 02 Transition probability : A(i, j) a ij 005 08 ( ) 05 09 Emission probability : B(i, j) b j (i) 05 01 Questions Given coin toss observations, estimate

$probability double trans[2][2] { {095,02}, {005,08} }; // trans[i][j] : j->i transition double emis[2][2] { {05,09}, {05,01} }; // emis[i][j] : b_j(o_i) double* alphas new double[t*2]; // forward$

6 An example An example path When observations were (walk, shop, clean) Similar to Dijkstra s or Manhattan tourist algorithm Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 A working example : Occasionally biased coin A generative Observations : O {1(Head), 2(Tail)} Hidden states : S {1(Fair), 2(Biased)} Initial states : π {09, 01} ( ) Transition probability : A(i, j) a ij ( ) Emission probability : B(i, j) b j (i) Questions Given coin toss observations, estimate the probability of each state Given coin toss observations, what is the most likely series of states? Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 implementations // assume that T is # of states, and o is array of coin toss (0/1) double pi[2] {09,01}; // initial 0/1 probability double trans[2][2] { {095,02}, {005,08} }; // trans[i][j] : j->i transition double emis[2][2] { {05,09}, {05,01} }; // emis[i][j] : b_j(o_i) double* alphas new double[t*2]; // forward probability (i,j)->(2*i+j) double* betas new double[t*2]; // forward algorithm alphas[0] pi[0] * emis[o[0]][0]; alphas[1] pi[1] * emis[o[0]][1]; for(int i1; i < T; ++i) { } alphas[i*2] (alphas[(i-1)*2] * trans[0][0] + // backward probability (i,j)->(2*i+j) alphas[(i-1)*2+1] * trans[0][1]) * emis[o[i]][0]; alphas[i*2+1] (alphas[(i-1)*2] * trans[1][0] + alphas[(i-1)*2+1] * trans[1][1]) * emis[o[i]][1]; Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

7 implmentations More s and beyond // backward algorithm betas[(t-1)*2] 1; betas[(t-1)*2+1] 1; for(int it-2; i > 0; --i) { betas[i*2] betas[(i+1)*2] * trans[0][0] * emis[o[i+1]][0] + betas[(i+1)*2+1] * trans[0][1] * emis[o[i+1]][1]; betas[i*2+1] betas[(i+1)*2] * trans[1][0] * emis[o[i+1]][0] + betas[(i+1)*2+1] * trans[1][1] * emis[o[i+1]][1]; } // summing forward-backward probabilities double* gammas new double[t*2]; for(int i0; i < T; ++i) { gammas[i*2] (alphas[i*2]*betas[i*2]); gammas[i*2+1] (alphas[i*2+1]*betas[i*2+1]); double z gammas[i*2]+gammas[i*2+1]; gammas[i*2] / z; gammas[i*2+1] / z; } Baum-Welch algorithm Estimate the transition and emission probabilities from data Iterative procedure to calculate the frequencies using E-M algorithm Will be introduced later Advanced graphical models Conditional random field - inference using undirected graphical model Bayesian network - inference from generalized graphical models Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27 Today - Hidden Markov Models Graphical models and conditional independence Forward-backward algorithm algorithm Implementations Next lectures Linear algebra Matrix decomposition Efficient matrix operations Hyun Min Kang Biostatistics 615/815 - Lecture 12 February 15th, / 27

Biostatistics 615/815 Lecture 20: The Baum-Welch Algorithm Advanced Hidden Markov Models

Biostatistics 615/815 Lecture 20: The Baum-Welch Algorithm Advanced Hidden Markov Models Hyun Min Kang November 22nd, 2011 Hyun Min Kang Biostatistics 615/815 - Lecture 20 November 22nd, 2011 1 / 31 Today