Lecture 10: Expectation-Maximization Algorithm

ECE 645: Estimation Theory, Spring 2015
Instructor: Prof. Stanley H. Chan

Lecture 10: Expectation-Maximization Algorithm
(LaTeX prepared by Shaobo Fang)
May 4, 2015

This lecture note is based on ECE 645 (Spring 2015) by Prof. Stanley H. Chan in the School of Electrical and Computer Engineering at Purdue University.

1 Motivation

Consider a set of data points with their classes labeled, and assume that each class is a Gaussian, as shown in Figure 1(a). Given this set of data points, finding the mean of each class can be done easily by computing the sample mean, since the class labels are known. Now imagine that the classes are not labeled, as shown in Figure 1(b). How should we determine the mean of each class then?

In order to solve this problem, we could use an iterative approach: first make a guess of the class label for each data point, then compute the means, and then update the guess of the class labels again. We repeat until the means converge. A small numerical sketch of this idea is given after Figure 1.

The problem of estimating parameters in the absence of labels is known as unsupervised learning. There are many unsupervised learning methods; we will focus on the Expectation-Maximization (EM) algorithm.

[Figure 1: (a) classes labelled (Class 1, Class 2); (b) classes unlabelled. Estimation of the parameters becomes trivial given the labelled classes.]
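The following is a minimal MATLAB sketch, not part of the original notes, of the alternating "guess the labels, then update the means" idea for two one-dimensional Gaussian classes. The data and the initial guesses are assumptions made purely for illustration; it uses hard label assignments, whereas the EM algorithm developed below replaces them with soft (probabilistic) assignments.

% Assumed setup: two 1-D Gaussian classes with unknown means.
rng(1);
y  = [randn(100,1) - 2; randn(100,1) + 3];     % unlabeled samples from two classes
mu = [-1, 1];                                   % initial guess of the two means
for t = 1:50
    d = [abs(y - mu(1)), abs(y - mu(2))];       % distance of every point to each mean
    [~, label] = min(d, [], 2);                 % guess a label for every data point
    mu(1) = mean(y(label == 1));                % recompute the mean of each class
    mu(2) = mean(y(label == 2));
end
disp(mu)                                        % approaches the true means (-2, 3)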

2 The EM-algorithm

Notations

1. Y, y: observations. Y is a random variable; y is a realization of Y.
2. X, x: complete data.
3. Z, z: missing data. Note that X = (Y, Z).
4. θ: unknown deterministic parameter. θ^(t): the t-th estimate of θ in the EM iteration.
5. f(y | θ) is the distribution of Y given θ.
6. f(X | θ) is a random variable taking the value f(x | θ). (Remember: f(· | θ) is a function, and thus we can put any argument into f(· | θ) and evaluate its output.)
7. E_{X|y,θ}[g(X)] = ∫ g(x) f_{X|y,θ}(x | y, θ) dx is the conditional expectation of g(X) given Y = y and θ.
8. l(θ) = log f(y | θ) is the log-likelihood. Note that l(θ) depends on y.

EM Steps

The EM-algorithm consists of two steps:

1. E-step: Given y, and pretending for the moment that θ^(t) is correct, formulate the distribution of the complete data x: f(x | y, θ^(t)). Then calculate the Q-function

       Q(θ | θ^(t)) := E_{X|y,θ^(t)}[ log f(X | θ) ] = ∫ log f(x | θ) f(x | y, θ^(t)) dx.

2. M-step: Maximize Q(θ | θ^(t)) with respect to θ:

       θ^(t+1) = argmax_θ Q(θ | θ^(t)).

Properties of Q(θ | θ^(t))

1. Ideally, if we had the complete data x, then finding the parameter could be done by maximizing f(x | θ). However, the complete data is only a virtual object we created to solve the problem; in reality we never know x. All we know is its distribution, which depends on what we do know about x. One way to handle this uncertainty is to compute the average, and this average is the Q-function.

2. Another way of looking at Q(θ | θ^(t)): we can treat log f(X | θ) as a function of two variables, h(X, θ). Maximizing over θ is problematic because the function still depends on X. By taking the expectation E_X[h(X, θ)] we eliminate the dependency on X.

3. Q(θ | θ^(t)) can be thought of as a local approximation of the log-likelihood function l(θ). Here, by "local" we mean that Q(θ | θ^(t)) stays close to the previous estimate θ^(t). In fact, if Q(θ | θ^(t)) ≥ Q(θ^(t) | θ^(t)), then l(θ) ≥ l(θ^(t)) (this is proved in Section 7).

3 Estimating Mean with Partial Observation

Let us consider the first example of the EM algorithm. Suppose that we generate a sequence of n random variables Y_i ~ N(θ, σ²) for i = 1, ..., n. Imagine that we have only observed Y = [Y_1, Y_2, ..., Y_m], where m < n. How should we estimate θ based on Y? Intuitively, the estimate of θ should be the sample mean of the m observations,

       θ̂ = (1/m) ∑_{i=1}^m Y_i.

However, in this example we would like to derive the EM algorithm and see whether it matches this intuition.

Solution: To start the EM algorithm, we first need to specify the missing data and the complete data. In this problem, the missing data is Z = [Y_{m+1}, ..., Y_n], and the complete data is X = [Y, Z]. The distribution of X is given by

       log f(x | θ) = -(n/2) log(2πσ²) - ∑_{i=1}^n (Y_i - θ)² / (2σ²).    (1)

Therefore, the Q-function is

       Q(θ | θ^(t)) := E_{X|Y,θ^(t)}[ log f(X | θ) ]
                     = E_{X|Y,θ^(t)}[ -(n/2) log(2πσ²) - ∑_{i=1}^n (Y_i - θ)² / (2σ²) ]
                     = -(n/2) log(2πσ²) - ∑_{i=1}^m (y_i - θ)² / (2σ²) - ∑_{i=m+1}^n E_{Y_i|Y,θ^(t)}[ (Y_i - θ)² ] / (2σ²).

The last expectation can be evaluated as

       E_{Y_i|Y,θ^(t)}[ (Y_i - θ)² ] = E_{Y_i|Y,θ^(t)}[ Y_i² - 2Y_iθ + θ² ] = (θ^(t))² + σ² - 2θ^(t)θ + θ².

Therefore, the Q-function is

       Q(θ | θ^(t)) = -(n/2) log(2πσ²) - ∑_{i=1}^m (y_i - θ)² / (2σ²) - ((n - m)/(2σ²)) [ (θ^(t))² + σ² - 2θ^(t)θ + θ² ].

In the M-step, we need to maximize the Q-function. To this end, we set

       ∂Q(θ | θ^(t))/∂θ = 0,

which yields

       θ^(t+1) = ( ∑_{i=1}^m y_i + (n - m)θ^(t) ) / n.

It is not difficult to show that as t → ∞, θ^(t) → θ^(∞). Hence,

       θ^(∞) = (1/n) ∑_{i=1}^m y_i + (1 - m/n) θ^(∞),

which yields

       θ^(∞) = (1/m) ∑_{i=1}^m y_i.

This result says that as the EM algorithm converges, the estimated parameter converges to the sample mean of the m available samples, which is quite intuitive.
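The following is a minimal MATLAB sketch, not part of the original notes, that runs the update θ^(t+1) = ( ∑_{i=1}^m y_i + (n - m)θ^(t) ) / n on assumed synthetic data; the iterates converge to the sample mean of the m observed samples, exactly as derived above.

% Assumed synthetic setup: n samples in total, only the first m are observed.
rng(0);
n = 100;  m = 30;  sigma = 2;  theta_true = 5;
y = theta_true + sigma*randn(m,1);        % the m observed samples

theta = 0;                                % arbitrary initial guess theta^(0)
for t = 1:200
    theta = (sum(y) + (n - m)*theta)/n;   % EM update derived in Section 3
end
fprintf('EM estimate: %.4f, sample mean: %.4f\n', theta, mean(y));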

4 Gaussian Mixture with Known Mean and Variance

Our next example of the EM algorithm is to estimate the mixture weights of a Gaussian mixture with known means and variances. A Gaussian mixture is defined as

       f(y | θ) = ∑_{i=1}^k θ_i N(y | µ_i, σ_i²),    (2)

where θ = [θ_1, ..., θ_k] is called the mixture weight. The mixture weight satisfies the condition ∑_{i=1}^k θ_i = 1. Our goal is to derive the EM algorithm for θ.

Solution: We first need to define the missing data. For this problem, the observed data is Y = [y_1, y_2, ..., y_n]. The missing data can be defined as the label of each y_j, so that Z = [Z_1, Z_2, ..., Z_n], with Z_j ∈ {1, ..., k}. Consequently, the complete data is X = [X_1, X_2, ..., X_n], where X_j = (y_j, Z_j). The distribution of the complete data can be computed as

       f(x_j | θ) = f(y_j, z_j | θ) = θ_{z_j} N(y_j | µ_{z_j}, σ_{z_j}²).

Thus, the Q-function is

       Q(θ | θ^(t)) = E_{X|y,θ^(t)}{ log f(X | θ) } = E_{Z|y,θ^(t)}{ log f(Z, y | θ) }
                    = E_{Z|y,θ^(t)}{ log ∏_{j=1}^n θ_{Z_j} N(y_j | µ_{Z_j}, σ_{Z_j}²) }
                    = ∑_{j=1}^n E_{Z_j|y_j,θ^(t)}{ log θ_{Z_j} + log N(y_j | µ_{Z_j}, σ_{Z_j}²) }.

The expectation can be evaluated as

       E_{Z_j|y_j,θ^(t)}{ log θ_{Z_j} } = ∑_{z_j=1}^k log θ_{z_j} P(Z_j = z_j | y_j, θ^(t)) = ∑_{i=1}^k log θ_i · P(Z_j = i | y_j, θ^(t)),

where

       P(Z_j = i | y_j, θ^(t)) = θ_i^(t) N(y_j | µ_i, σ_i²) / ∑_{l=1}^k θ_l^(t) N(y_j | µ_l, σ_l²).

By summing over all j's, we can further define

       γ_i^(t) := ∑_{j=1}^n P(Z_j = i | y_j, θ^(t)).

Therefore, the Q-function becomes

       Q(θ | θ^(t)) = ∑_{i=1}^k γ_i^(t) log θ_i + C,

for some constant C independent of θ (the terms log N(y_j | µ_i, σ_i²) do not depend on θ because the means and variances are known). Maximizing over θ yields

       θ^(t+1) = argmax_θ ∑_{i=1}^k γ_i^(t) log θ_i,   i.e.,   θ_i^(t+1) = γ_i^(t) / ∑_{l=1}^k γ_l^(t) = γ_i^(t) / n,

where the last equality is due to the Gibbs inequality and the fact that ∑_{l=1}^k γ_l^(t) = n. To summarize, the EM algorithm is given in the algorithm below.

Data: Gaussian mixture with known means and variances
Result: Estimated θ
for t = 1, 2, ... do
       γ_i^(t) = ∑_{j=1}^n [ θ_i^(t) N(y_j | µ_i, σ_i²) / ∑_{l=1}^k θ_l^(t) N(y_j | µ_l, σ_l²) ]
       θ_i^(t+1) = γ_i^(t) / n
end

Remark: To solve argmax_θ ∑_{i=1}^k γ_i^(t) log θ_i, we use the Gibbs inequality. The Gibbs inequality states that for all α_i and β_i such that ∑_i α_i = 1, ∑_i β_i = 1, 0 ≤ α_i ≤ 1, and 0 ≤ β_i ≤ 1, it holds that

       ∑_i α_i log β_i ≤ ∑_i α_i log α_i,    (3)

with equality when α_i = β_i for all i. The proof of the Gibbs inequality follows from the non-negativity of the KL-divergence, which we skip. What we want to show is that if we let

       α_i = γ_i^(t) / ∑_l γ_l^(t),   β_i = θ_i,

then ∑_i γ_i^(t) log θ_i is maximized, with equality in (3), when

       θ_i = γ_i^(t) / ∑_l γ_l^(t),

which is the result we want.
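Below is a minimal MATLAB sketch, not part of the original notes, of the algorithm box above for k = 2 components with assumed (known) means and variances; only the mixture weights are estimated. The Gaussian density is written out explicitly so that no toolbox function is needed.

% Assumed setup: two components with known means/variances; estimate the weights.
rng(0);
mu = [0, 4];  sig = [1, 1];  theta_true = [0.3, 0.7];
n  = 2000;
n1 = round(n*theta_true(1));
y  = [mu(1) + sig(1)*randn(n1,1); mu(2) + sig(2)*randn(n - n1,1)];

theta = [0.5, 0.5];                               % initial mixture weights
for t = 1:100
    % responsibilities P(Z_j = i | y_j, theta^(t)); Gaussian pdfs written out
    N1 = exp(-(y - mu(1)).^2/(2*sig(1)^2))/sqrt(2*pi*sig(1)^2);
    N2 = exp(-(y - mu(2)).^2/(2*sig(2)^2))/sqrt(2*pi*sig(2)^2);
    W  = [theta(1)*N1, theta(2)*N2];
    W  = W ./ repmat(sum(W,2), 1, 2);             % normalize over the two components
    theta = sum(W,1)/n;                           % theta_i^(t+1) = gamma_i^(t)/n
end
disp(theta)                                       % close to the true weights [0.3, 0.7]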

5 Gaussian Mixture

Previously we worked on Gaussian mixtures with known means and variances. Most of the time, however, neither the means nor the variances are available to us. Thus, we are interested in deriving an EM algorithm that applies to a general Gaussian mixture model when only the observations are available. Recall that a Gaussian mixture is defined as

       f(y | θ) = ∑_{i=1}^k π_i N(y | µ_i, Σ_i),    (4)

where θ := {(π_i, µ_i, Σ_i)}_{i=1}^k is the parameter, with ∑_{i=1}^k π_i = 1. Our goal is to derive the EM algorithm for learning θ.

Solution: We first specify the following data:

Observed data: Y = [Y_1, ..., Y_n] with realizations y = [y_1, ..., y_n];
Missing data: Z = [Z_1, ..., Z_n] with realizations z = [z_1, ..., z_n], where z_j ∈ {1, ..., k};
Complete data: X = [X_1, ..., X_n] with realizations x = [x_1, ..., x_n] and x_j = (y_j, z_j).

Accordingly, the distribution of the complete data is

       f(y_j, z_j | θ) = π_{z_j} N(y_j | µ_{z_j}, Σ_{z_j}).

Therefore, we can show that

       γ_{ij}^(t) := P(Z_j = i | y_j, θ^(t)) = π_i^(t) N(y_j | µ_i^(t), Σ_i^(t)) / ∑_{l=1}^k π_l^(t) N(y_j | µ_l^(t), Σ_l^(t)).

The Q-function is

       Q(θ | θ^(t)) = E_{X|y,θ^(t)}{ log f(X | θ) } = E_{Z|y,θ^(t)}{ log f(Z, y | θ) }
                    = ∑_{j=1}^n E_{Z_j|y_j,θ^(t)}{ log( π_{Z_j} N(y_j | µ_{Z_j}, Σ_{Z_j}) ) }
                    = ∑_{j=1}^n E_{Z_j|y_j,θ^(t)}{ log π_{Z_j} - (1/2) log|Σ_{Z_j}| - (1/2)(y_j - µ_{Z_j})^T Σ_{Z_j}^{-1} (y_j - µ_{Z_j}) } + C
                    = ∑_{j=1}^n ∑_{i=1}^k P(Z_j = i | y_j, θ^(t)) { log π_i - (1/2) log|Σ_i| - (1/2)(y_j - µ_i)^T Σ_i^{-1} (y_j - µ_i) } + C
                    = ∑_{j=1}^n ∑_{i=1}^k γ_{ij}^(t) { log π_i - (1/2) log|Σ_i| - (1/2)(y_j - µ_i)^T Σ_i^{-1} (y_j - µ_i) } + C,

where C is a constant independent of θ. The maximization step is to solve the following optimization problem:

       maximize_θ   Q(θ | θ^(t))
       subject to   ∑_{i=1}^k π_i = 1,  π_i > 0,  Σ_i ≻ 0.    (5)

For π_i, the maximization is

       maximize_π   ∑_{j=1}^n ∑_{i=1}^k γ_{ij}^(t) log π_i
       subject to   ∑_{i=1}^k π_i = 1,  π_i > 0.    (6)

The solution of this problem is

       π_i^(t+1) = ∑_{j=1}^n γ_{ij}^(t) / ∑_{l=1}^k ∑_{j=1}^n γ_{lj}^(t) = ∑_{j=1}^n γ_{ij}^(t) / n.    (7)

For µ_i, the maximization can be reduced to solving the equation

       ∇_{µ_i} Q(θ | θ^(t)) = 0.    (8)

The left-hand side is

       ∇_{µ_i} Q(θ | θ^(t)) = ∇_{µ_i} { -(1/2) ∑_{j=1}^n γ_{ij}^(t) (y_j - µ_i)^T Σ_i^{-1} (y_j - µ_i) }
                            = Σ_i^{-1} ( ∑_{j=1}^n γ_{ij}^(t) y_j - ∑_{j=1}^n γ_{ij}^(t) µ_i ).

Therefore,

       µ_i^(t+1) = ∑_{j=1}^n γ_{ij}^(t) y_j / ∑_{j=1}^n γ_{ij}^(t).    (9)

For Σ_i, the maximization is equivalent to solving

       ∇_{Σ_i} Q(θ | θ^(t)) = 0.    (10)

The left-hand side is

       ∇_{Σ_i} Q(θ | θ^(t)) = -(1/2) ( ∑_{j=1}^n γ_{ij}^(t) ) ∇_{Σ_i} log|Σ_i| - (1/2) ∑_{j=1}^n γ_{ij}^(t) ∇_{Σ_i} { (y_j - µ_i)^T Σ_i^{-1} (y_j - µ_i) }
                            = -(1/2) ( ∑_{j=1}^n γ_{ij}^(t) ) Σ_i^{-1} + (1/2) ∑_{j=1}^n γ_{ij}^(t) Σ_i^{-1} (y_j - µ_i)(y_j - µ_i)^T Σ_i^{-1}.

Therefore,

       Σ_i^(t+1) = ∑_{j=1}^n γ_{ij}^(t) (y_j - µ_i^(t+1))(y_j - µ_i^(t+1))^T / ∑_{j=1}^n γ_{ij}^(t).    (11)
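The following is a minimal one-dimensional MATLAB sketch, not part of the original notes, that alternates the E-step with the updates (7), (9), and (11) on assumed synthetic data; scalar variances replace the covariance matrices Σ_i purely for brevity.

% Assumed 1-D data and initialization; k = 2 components.
rng(0);
y  = [randn(300,1) - 2; 1.5*randn(700,1) + 3];    % synthetic mixture data
n  = numel(y);  k = 2;
pi_w = ones(1,k)/k;  mu = [-1, 1];  s2 = [1, 1];  % initial parameters

for t = 1:100
    % E-step: responsibilities gamma_{ij}^(t)
    G = zeros(n,k);
    for i = 1:k
        G(:,i) = pi_w(i)*exp(-(y - mu(i)).^2/(2*s2(i)))/sqrt(2*pi*s2(i));
    end
    G = G ./ repmat(sum(G,2), 1, k);
    % M-step: updates (7), (9), (11)
    Nk   = sum(G,1);
    pi_w = Nk/n;
    for i = 1:k
        mu(i) = sum(G(:,i).*y)/Nk(i);
        s2(i) = sum(G(:,i).*(y - mu(i)).^2)/Nk(i);
    end
end
disp([pi_w; mu; s2])   % rows: estimated weights, means, variances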

6 Bernoulli Mixture

Our next example is the Bernoulli mixture model. To motivate this problem, imagine that we have a dataset of various items. Our goal is to see whether there is any relationship between the presence or absence of these items. For example, if object A (e.g., a tree) is present, there is some probability that object B (e.g., a flower) is also present. However, if a certain object C (e.g., a dinosaur) is present, it is unlikely that we also see object D (e.g., a car, unless you are in Jurassic Park!).

To set up the problem, let us first define some notation. We use Y^1, ..., Y^N to denote the N images we have observed. In each image there are at most M items, so that Y^n = [Y_1^n, ..., Y_M^n] for n = 1, ..., N. Each entry of this vector is a Bernoulli random variable. Moreover, we define

       P(Y_i^n = 1 | Y_k^n = 1) := θ_ik.    (12)

Therefore, the goal is to estimate the matrix

       Θ = [ θ_11  ⋯  θ_1M ]
           [  ⋮         ⋮  ]
           [ θ_M1  ⋯  θ_MM ]    (13)

from the observations Y^1, ..., Y^N.

The general problem of estimating Θ from Y^1, ..., Y^N is very difficult. Therefore, it is necessary to impose some assumptions on the problem. The assumption we make here is semi-valid from our daily experience: it is not completely true, but it is simple enough to provide a computational solution.

Assumption 1 (Conditional Independence). We assume that the observations follow the conditional independence structure

       P(Y_i^n = 1, Y_j^n = 1 | Y_k^n = 1) = P(Y_i^n = 1 | Y_k^n = 1) · P(Y_j^n = 1 | Y_k^n = 1).    (14)

Remark: Conditional independence is not the same as independence. For example, let A be the event that a puppy breaks a toy, B the event that a mother yells, and C the event that a child cries. Without knowing the relationship, it could be that the child cries because the mother yells. However, if we assume the conditional independence of B and C given A, then the crying of the child and the yelling of the mother are both triggered by the dog, but not by each other.

Individual Model

In order to understand the EM algorithm for the Bernoulli mixture, let us fix n. Consequently,

       P(Y^n = y^n) = ∑_{m=1}^M P(Y^n = y^n | item m is active) · P(item m is active),

where we define π_m := P(item m is active). Furthermore,

       P(Y^n = y^n | item m is active) = ∏_{i=1}^M θ_mi^{y_i^n} (1 - θ_mi)^{1 - y_i^n} =: f_m(y^n | θ_m),

where θ_m = [θ_m1, ..., θ_mM] is the m-th row of Θ. Therefore,

       P(Y^n = y^n) = ∑_{m=1}^M π_m f_m(y^n | θ_m).    (15)

EM Algorithm

Now we derive the EM algorithm to estimate {π_1, ..., π_M} and Θ. To start, let us define the following types of data:

Observed data: Y^1, ..., Y^N;
Missing data: Z^1, ..., Z^N with realizations z^1, ..., z^N, where z^n ∈ {1, ..., M} is the label of the active item in image n;
Complete data: X^1, ..., X^N with, accordingly, x^n = (y^n, z^n).

The distribution of the complete data is

       P(Y^n = y^n, Z^n = m | Θ) = π_m f_m(y^n | θ_m).

The distribution of the missing data conditioned on the observed data is

       γ_nm^(t) := P(Z^n = m | Y^n = y^n, Θ^(t)) = π_m^(t) f_m(y^n | θ_m^(t)) / ∑_{m'=1}^M π_{m'}^(t) f_{m'}(y^n | θ_{m'}^(t)).

The n-th Q-function is

       Q_n(Θ | Θ^(t)) := E_{Z^n|y^n,Θ^(t)}[ log f(x^n | Θ) ] = E_{Z^n|y^n,Θ^(t)}[ log f(Z^n, y^n | Θ) ]
                      = ∑_{m=1}^M γ_nm^(t) log( π_m f_m(y^n | θ_m) ),

where

       log( π_m f_m(y^n | θ_m) ) = log π_m + ∑_{i=1}^M [ y_i^n log θ_mi + (1 - y_i^n) log(1 - θ_mi) ].

Therefore, the overall Q-function is

       Q(Θ | Θ^(t)) = ∑_{n=1}^N ∑_{m=1}^M γ_nm^(t) [ log π_m + ∑_{i=1}^M ( y_i^n log θ_mi + (1 - y_i^n) log(1 - θ_mi) ) ].    (16)

To maximize the Q-function, we solve

       Θ^(t+1) = argmax_Θ Q(Θ | Θ^(t)).    (17)

For a fixed m and i, we have

       ∂Q(Θ | Θ^(t))/∂θ_mi = ∑_{n=1}^N γ_nm^(t) [ y_i^n / θ_mi - (1 - y_i^n) / (1 - θ_mi) ].

Setting this to zero yields

       ∑_{n=1}^N γ_nm^(t) y_i^n / θ_mi = ∑_{n=1}^N γ_nm^(t) (1 - y_i^n) / (1 - θ_mi),

which gives

       θ_mi^(t+1) = ∑_{n=1}^N γ_nm^(t) y_i^n / ∑_{n=1}^N γ_nm^(t).

The EM algorithm is summarized below.

Data: EM algorithm for the Bernoulli mixture model
Result: Estimated Θ and π_m
for t = 1, 2, ... do
       γ_nm^(t) = π_m^(t) f_m(y^n | θ_m^(t)) / ∑_{m'=1}^M π_{m'}^(t) f_{m'}(y^n | θ_{m'}^(t))
       θ_mi^(t+1) = ∑_{n=1}^N γ_nm^(t) y_i^n / ∑_{n=1}^N γ_nm^(t)
       π_m^(t+1) = ∑_{n=1}^N γ_nm^(t) / N    (18)
end
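Below is a compact MATLAB sketch, not part of the original notes, of these updates on assumed synthetic data with two components; Section 9 gives the notes' own, more detailed MATLAB demo of the same model.

% Assumed setup: N binary vectors of length M from a 2-component Bernoulli mixture.
rng(0);
N = 1000;  M = 4;
pi_true = [0.4, 0.6];
Th_true = [0.9 0.8 0.1 0.2;                  % m-th row: P(Y_i^n = 1 | component m)
           0.1 0.2 0.9 0.8];
z = 1 + (rand(N,1) > pi_true(1));            % hidden component labels
Y = double(rand(N,M) < Th_true(z,:));        % observed binary vectors

pim = [0.5, 0.5];  Th = 0.3 + 0.4*rand(2,M); % initial parameters
for t = 1:100
    % E-step: gamma_nm = P(Z^n = m | y^n, Theta^(t)), with f_m as in (15)
    G = zeros(N,2);
    for m = 1:2
        Tm = repmat(Th(m,:), N, 1);
        G(:,m) = pim(m) * prod(Tm.^Y .* (1-Tm).^(1-Y), 2);
    end
    G = G ./ repmat(sum(G,2), 1, 2);
    % M-step: the updates in (18)
    Nm  = sum(G,1);
    pim = Nm / N;
    Th  = (G' * Y) ./ repmat(Nm', 1, M);
end
disp(pim)                                    % recovered weights (up to relabeling)
disp(Th)                                     % recovered Theta   (up to relabeling)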

7 Convergence of EM

The convergence of the EM algorithm is known to be local. What this means is that, as the EM algorithm iterates, θ^(t+1) will never be less likely than θ^(t). This property is called the monotonicity of EM, and it is a consequence of the following theorem.

Theorem 1. Let X and Y be two random variables with parametric distributions controlled by a parameter θ ∈ Λ. Suppose that

1. the support X of X does not depend on θ;
2. there exists a Markov relationship θ → X → Y, i.e., f(y | x, θ) = f(y | x), for all θ ∈ Λ, x ∈ X, and y ∈ Y.

Then, for θ ∈ Λ and y ∈ Y such that X(y) (the set of complete-data values consistent with y) is nonempty, we have

       l(θ) ≥ l(θ^(t))   if   Q(θ | θ^(t)) ≥ Q(θ^(t) | θ^(t)).    (19)

Proof.

       l(θ) = log f(y | θ)                                                                        (by definition)
            = log ∫_{X(y)} f(x, y | θ) dx                                                         (marginalization, i.e., total probability)
            = log ∫_{X(y)} [ f(x, y | θ) / f(x | y, θ^(t)) ] f(x | y, θ^(t)) dx
            = log E_{X|y,θ^(t)}[ f(X, y | θ) / f(X | y, θ^(t)) ]
            ≥ E_{X|y,θ^(t)}[ log ( f(X, y | θ) / f(X | y, θ^(t)) ) ]                              (Jensen's inequality)
            = E_{X|y,θ^(t)}[ log ( f(y | X, θ) f(X | θ) f(y | θ^(t)) / ( f(y | X, θ^(t)) f(X | θ^(t)) ) ) ]   (Bayes' rule)
            = E_{X|y,θ^(t)}[ log ( f(y | X) f(X | θ) f(y | θ^(t)) / ( f(y | X) f(X | θ^(t)) ) ) ]             (assumption 2)
            = E_{X|y,θ^(t)}[ log ( f(X | θ) f(y | θ^(t)) / f(X | θ^(t)) ) ]
            = E_{X|y,θ^(t)}[ log f(X | θ) ] - E_{X|y,θ^(t)}[ log f(X | θ^(t)) ] + E_{X|y,θ^(t)}[ log f(y | θ^(t)) ]
            = Q(θ | θ^(t)) - Q(θ^(t) | θ^(t)) + l(θ^(t)).

Thus, l(θ) - l(θ^(t)) ≥ Q(θ | θ^(t)) - Q(θ^(t) | θ^(t)). Hence, if Q(θ | θ^(t)) ≥ Q(θ^(t) | θ^(t)), then l(θ) ≥ l(θ^(t)).

8 Using Prior with EM

The EM algorithm can fail due to singularity of the log-likelihood function. For example, when learning a GMM with 10 components, the algorithm may decide that the most likely solution is for one of the Gaussians to have only one data point assigned to it, which can yield the bad result of a zero covariance. To alleviate this problem, one can use prior information about θ. In this case, we modify the EM steps as

       E-step:  Q(θ | θ^(t)) = E_{X|y,θ^(t)}[ log f(X | θ) ];
       M-step:  θ^(t+1) = argmax_θ { Q(θ | θ^(t)) + log f(θ) },

where f(θ) is the prior.

Example. Assume that we have a GMM of k components:

       f(y_j | θ) = ∑_{i=1}^k w_i N(y_j | µ_i, σ²).    (20)

Let us consider a constraint on µ_i: µ_i = µ + (i - 1)∆µ for i = 1, ..., k, i.e., the means are equally spaced. (For details please refer to Section 3.3 of Gupta and Chen.)

Priors: We assume the following priors:

1. σ² ~ inverse-gamma(ν/2, ξ²/2). That is,

       f(σ²) = ( (ξ²/2)^{ν/2} / Γ(ν/2) ) (σ²)^{-ν/2 - 1} exp( -ξ²/(2σ²) ).

2. ∆µ | σ² ~ N(η, σ²/ρ). That is,

       f(∆µ | σ²) ∝ (σ²/ρ)^{-1/2} exp( -ρ(∆µ - η)² / (2σ²) ).

Therefore, the joint distribution of the prior is

       f(∆µ, σ²) ∝ (σ²)^{-(ν+3)/2} exp( -( ξ² + ρ(∆µ - η)² ) / (2σ²) ).    (21)

Parameters: θ = (w_1, ..., w_k, µ, ∆µ, σ²). Our goal is to estimate θ.

EM algorithm: First of all, we let

       γ_ij^(t) = w_i^(t) N(y_j | µ_i^(t), σ²^(t)) / ∑_{l=1}^k w_l^(t) N(y_j | µ_l^(t), σ²^(t)).    (22)

The EM steps can be derived as follows.

The Expectation Step:

       Q(θ | θ^(t)) = ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) log( w_i N(y_j | µ_i, σ²) )
                    = ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) log( w_i N(y_j | µ + (i-1)∆µ, σ²) )
                    = ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) log w_i - (n/2) log(2π) - (n/2) log(σ²)
                      - (1/(2σ²)) ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) ( y_j - µ - (i-1)∆µ )².

The Maximization Step:

       θ^(t+1) = argmax_θ  Q(θ | θ^(t)) + log f(θ)
               = argmax_θ  ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) log w_i - ((n + ν + 3)/2) log σ² - ( ξ² + ρ(∆µ - η)² ) / (2σ²)
                           - (1/(2σ²)) ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) ( y_j - µ - (i-1)∆µ )² + C.

Thus,

       w_i^(t+1) = ∑_{j=1}^n γ_ij^(t) / ∑_{l=1}^k ∑_{j=1}^n γ_lj^(t) = (1/n) ∑_{j=1}^n γ_ij^(t),

and setting

       ∂/∂µ [ Q(θ | θ^(t)) + log f(θ) ] = 0   and   ∂/∂∆µ [ Q(θ | θ^(t)) + log f(θ) ] = 0

gives the 2×2 linear system

       [ 1                              ∑_{i=1}^k (i-1) w_i^(t+1)         ] [ µ  ]   [ (1/n) ∑_{j=1}^n y_j                                     ]
       [ ∑_{i=1}^k (i-1) w_i^(t+1)      ∑_{i=1}^k (i-1)² w_i^(t+1) + ρ/n  ] [ ∆µ ] = [ (1/n) ( ρη + ∑_{j=1}^n ∑_{i=1}^k (i-1) γ_ij^(t) y_j )   ].

The solution for µ and ∆µ is obtained by solving this linear system. Finally, setting

       ∂/∂σ² [ Q(θ | θ^(t)) + log f(θ) ] = 0

yields

       σ²^(t+1) = [ ξ² + ρ( ∆µ^(t+1) - η )² + ∑_{j=1}^n ∑_{i=1}^k γ_ij^(t) ( y_j - µ^(t+1) - (i-1)∆µ^(t+1) )² ] / (n + ν + 3).
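The following is a minimal MATLAB sketch, not part of the original notes, of the MAP-EM updates just derived; the synthetic data and the hyperparameter values ν, ξ², ρ, η are assumptions made purely for illustration. The linear system is written in its un-normalized form (both rows multiplied by n), which is equivalent to the system above.

% Assumed data and hyperparameters: MAP-EM for mu_i = mu + (i-1)*dmu, shared sigma^2.
rng(0);
k = 3;  n = 600;
y = [randn(200,1); randn(200,1)+2; randn(200,1)+4];    % true mu = 0, dmu = 2
nu = 1;  xi2 = 1;  rho = 1;  eta = 1;                  % assumed hyperparameters

w = ones(1,k)/k;  mu = 0.5;  dmu = 1;  s2 = 4;         % initialization
for t = 1:100
    % E-step: responsibilities gamma_ij as in (22)
    G = zeros(n,k);
    for i = 1:k
        mui = mu + (i-1)*dmu;
        G(:,i) = w(i)*exp(-(y - mui).^2/(2*s2))/sqrt(2*pi*s2);
    end
    G = G ./ repmat(sum(G,2), 1, k);
    % M-step with the prior
    w  = sum(G,1)/n;
    c  = (0:k-1);                                      % the (i-1) values
    S1 = sum(G*c');                                    % sum_j sum_i (i-1)   gamma_ij
    S2 = sum(G*(c.^2)');                               % sum_j sum_i (i-1)^2 gamma_ij
    A  = [n, S1; S1, S2 + rho];
    b  = [sum(y); rho*eta + sum((G*c').*y)];
    sol = A\b;  mu = sol(1);  dmu = sol(2);
    R  = 0;
    for i = 1:k
        R = R + sum(G(:,i).*(y - mu - (i-1)*dmu).^2);
    end
    s2 = (xi2 + rho*(dmu - eta)^2 + R)/(n + nu + 3);
end
disp([mu, dmu, s2])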

9 MATLAB Demo: EM Algorithm for Bernoulli Mixture

9.1 Synthesize the Data

function [data_rand] = MakeData(DS, u_vec, p_mat)
% Generate DS samples from a Bernoulli mixture with weights u_vec and
% probability matrix p_mat, and return the rows in random order.

cnt = 0;
for i = 1:1:length(u_vec)
    N = DS*u_vec(i);               % number of samples from mixture component i
    p_vec = p_mat(i,:);
    %%
    for m = 1:1:length(p_vec)
        data_vec = randperm(N);
        th = N*p_mat(i,m);
        for n = 1:1:N
            if data_vec(n) > th
                data_vec(n) = 0;
            else
                data_vec(n) = 1;
            end
        end
        data(cnt+1:cnt+N, m) = data_vec';
    end
    cnt = cnt + N;
end

%% Now randomly permute the rows of the matrix
[row, column] = size(data);
row_vec = randperm(row);
for i = 1:1:row
    randtemp = row_vec(i);
    data_rand(i,:) = data(randtemp,:);
end

end

9.2 Estimate the Probability of a Vector Given a Bernoulli Distribution

function [p_b] = Bernoulli_vec(p_vec, y_vec)
%% Calculate the probability of y_vec under the current Bernoulli mixture component
p_b = 1;
for i = 1:1:length(p_vec)
    p_b = p_b*(p_vec(i)^(y_vec(i)))*((1-p_vec(i))^(1-y_vec(i)));
end

end

9.3 The Main Function for EM with Bernoulli Mixture

close all
clear all
clc
DS = input('Enter the synthesized data size:');
u_vec = [1/4, 1/2, 1/4]
p_mat = [1,   0.4,  0.05;
         0.2, 1,    0.8;
         0.3, 0.7,  1]
data_rand = MakeData(DS, u_vec, p_mat);
T = input('Enter the desired number of iterations:');

%% Pick initialization of parameters
u_initial = [1/4, 1/8, 5/8];
p_initial = [0.3, 0.2,  0.8;
             0.1, 0.8,  0.7;
             0.5, 0.15, 0.6];

M = length(u_initial);
N = size(data_rand, 1);
% Initialize the parameters
u = u_initial;
p = p_initial;

u_history = zeros(M,T);
p_history = zeros(M,M,T);

for t = 1:1:T
    % E-step: compute the hidden-variable posteriors lambda(m,n)
    for m = 1:1:M
        p_m = u(m);
        p_vec = p(m,:);
        for n = 1:1:N
            y_vec = data_rand(n,:);
            %% Find the hidden variable, lambda
            numerator = p_m*Bernoulli_vec(p_vec, y_vec);   % model the Bernoulli process
            denom = 0;
            for mm = 1:1:M
                p_vec_tmp = p(mm,:);
                denom = denom + u(mm)*Bernoulli_vec(p_vec_tmp, y_vec);
            end
            lambda(m,n) = numerator/denom;
        end
    end

    sum_lambda = sum(sum(lambda));

    %% M-step: update the mixture weights u
    for m = 1:1:M
        u(m) = sum(lambda(m,:))/sum_lambda;
    end

    %% M-step: update the P matrix
    for i = 1:1:M
        for m = 1:1:M
            p(m,i) = (sum(lambda(m,:).*data_rand(:,i)'))/(sum(lambda(m,:)));
        end
    end

    %% Save in history for each iteration to plot
    u_history(:,t) = u;
    p_history(:,:,t) = p;
end
disp('Updated p and u:')
p
u

figure
hold on
grid on
for m = 1:1:M
    plot(u_history(m,:));
end
ylabel('Estimated \mu value', 'FontSize', 20)
xlabel('Iterations', 'FontSize', 20)
title('Convergence of \mu estimated for Mixture Number 3', 'FontSize', 20)
for m = 1:1:M
    stem(T, u_vec(m));
end

figure
hold on
grid on
for ii = 1:1:M
    for jj = 1:1:M
        for t = 1:1:T
            tmp = p_history(ii,jj,t);
            plot_vec(t) = tmp;
        end
        plot(plot_vec)
    end
end
ylabel('Estimated P matrix values', 'FontSize', 20)
xlabel('Iterations', 'FontSize', 20)
title('Convergence of P matrix estimated for Mixture Number 3', 'FontSize', 20)
for m = 1:1:M
    for n = 1:1:M
        stem(T, p_mat(m,n));
    end
end

for m = 1:1:M
    one_loc = find(abs(p(m,:) - 1) == min(abs(p(m,:) - 1)));
    p_final(one_loc,:) = p(m,:);
    u_final(one_loc) = u(m);
end

disp('After Automatic Sorting Based on Diagonals:')
p_final
u_final
