MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)


MATH 829: Introduction to Data Mining and Analysis
The EM algorithm (part 2)
Dominique Guillot
Department of Mathematical Sciences
University of Delaware
April 20, 2016

Recall

We are given independent observations $(x^{(i)}, z^{(i)})$ with missing values $z^{(i)}$. Let $\theta^{(0)}$ be an initial guess for $\theta$.

Given the current estimate $\theta^{(t)}$ of $\theta$, compute
$$Q(\theta \mid \theta^{(t)}) := E_{z \mid x;\, \theta^{(t)}}\big[\log p(x, z; \theta)\big] = \sum_i E_{z^{(i)} \mid x^{(i)};\, \theta^{(t)}}\big[\log p(x^{(i)}, z^{(i)}; \theta)\big] \qquad \text{(E step)}$$
(In other words, we average the missing values according to their distribution given the observed values.)

We then optimize $Q(\theta \mid \theta^{(t)})$ with respect to $\theta$:
$$\theta^{(t+1)} := \operatorname{argmax}_{\theta}\, Q(\theta \mid \theta^{(t)}) \qquad \text{(M step)}.$$

Theorem: The sequence $\theta^{(t)}$ constructed by the EM algorithm satisfies $l(\theta^{(t+1)}) \geq l(\theta^{(t)})$.
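The following is a minimal sketch (not from the lecture) of the generic loop implied by these two steps; `e_step` and `m_step` are hypothetical placeholders to be supplied for a concrete model, as in the examples later in these notes.

```python
import numpy as np

def em(theta0, e_step, m_step, max_iter=100, tol=1e-8):
    """Generic EM loop (sketch).

    theta0 : initial guess theta^(0), as a NumPy array
    e_step : callable(theta) -> expected sufficient statistics defining Q(. | theta)
    m_step : callable(expectations) -> argmax_theta Q(theta | theta^(t))
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        expectations = e_step(theta)                                # E step
        theta_new = np.asarray(m_step(expectations), dtype=float)   # M step
        if np.max(np.abs(theta_new - theta)) < tol:
            return theta_new
        theta = theta_new
    return theta
```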

Convergence of the EM algorithm - Jensen's inequality

Recall: if $\varphi : \mathbb{R} \to \mathbb{R}$ is convex and $X$ is a random variable, then
$$\varphi(E(X)) \leq E(\varphi(X)).$$
In other words, if $\mu$ is a probability measure on $\Omega$, $g : \Omega \to \mathbb{R}$, and $\varphi : \mathbb{R} \to \mathbb{R}$ is convex, then
$$\varphi\left(\int_\Omega g \, d\mu\right) \leq \int_\Omega \varphi \circ g \, d\mu.$$

Note:
1. The inequality is reversed if $\varphi$ is concave instead of convex.
2. Equality holds if and only if $g$ is constant or $\varphi(x) = ax + b$.

Previously, to deal with missing values, our goal was to maximize
$$\sum_i \log p(x^{(i)}; \theta) = \sum_i \log \sum_{z^{(i)}} p(x^{(i)}, z^{(i)}; \theta).$$
Let $Q_i(z)$ be any probability distribution for $z^{(i)}$, i.e.,
1. $Q_i(z) \geq 0$,
2. $\sum_z Q_i(z) = 1$.
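As a quick numerical sanity check (not from the slides), here is a small Monte Carlo illustration of Jensen's inequality with the convex function $\varphi(x) = e^x$ and $X \sim N(0,1)$; the exact values are $\varphi(E X) = 1$ and $E\,\varphi(X) = e^{1/2} \approx 1.65$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=100_000)  # samples of X ~ N(0, 1)

lhs = np.exp(x.mean())   # phi(E[X]), with E[X] estimated by Monte Carlo
rhs = np.exp(x).mean()   # E[phi(X)]
print(lhs, "<=", rhs)    # approximately 1.0 <= 1.65
```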

Convergence of the EM algorithm (cont.)

Then, using Jensen's inequality (the $\log$ is concave, so the inequality is reversed):
$$l(\theta) = \sum_i \log p(x^{(i)}; \theta) = \sum_i \log \sum_{z^{(i)}} p(x^{(i)}, z^{(i)}; \theta) = \sum_i \log \sum_{z^{(i)}} Q_i(z^{(i)}) \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})} \geq \sum_i \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}.$$

Thinking of the inner sum as an expectation with respect to the distribution $Q_i$, we have shown:
$$\log p(x^{(i)}; \theta) \geq E_{z^{(i)} \sim Q_i}\left[\log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}\right].$$
How can we choose $Q_i$ to get the best lower bound possible?

Convergence of the EM algorithm (cont.)

At every iteration of the EM algorithm, we choose $Q_i$ to make the inequality
$$\log p(x^{(i)}; \theta) \geq E_{z^{(i)} \sim Q_i}\left[\log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}\right]$$
tight at our current estimate $\theta = \theta^{(t)}$.

By the equality case in Jensen's inequality,
$$\log p(x^{(i)}; \theta) = E_{z^{(i)} \sim Q_i}\left[\log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}\right] \iff \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})} = c_i \text{ for all } z^{(i)}.$$
In other words: $Q_i(z^{(i)}) \propto p(x^{(i)}, z^{(i)}; \theta)$.

Now, for $Q_i$ to be a probability distribution, we need to choose:
$$Q_i(z^{(i)}) = \frac{p(x^{(i)}, z^{(i)}; \theta)}{\sum_{z^{(i)}} p(x^{(i)}, z^{(i)}; \theta)} = p(z^{(i)} \mid x^{(i)}; \theta).$$
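A small numerical illustration (not from the slides) of this choice, for a single observation $x$ from a two-component model with a discrete latent $z$: an arbitrary $Q$ gives a strict lower bound on $\log p(x; \theta)$, while the posterior $p(z \mid x; \theta)$ attains it. The mixture weights and means below are arbitrary values chosen for the demonstration.

```python
import numpy as np
from scipy.stats import norm

# One observation x from a 2-component Gaussian mixture; z takes values in {0, 1}.
p = np.array([0.3, 0.7])       # mixing probabilities (arbitrary)
mu = np.array([-1.0, 2.0])     # component means (arbitrary)
x = 0.5

joint = p * norm.pdf(x, loc=mu, scale=1.0)   # p(x, z; theta) for z = 0, 1
log_px = np.log(joint.sum())                 # log p(x; theta)

def bound(Q):
    # E_{z ~ Q}[ log( p(x, z; theta) / Q(z) ) ]
    return np.sum(Q * (np.log(joint) - np.log(Q)))

Q_arbitrary = np.array([0.5, 0.5])
Q_posterior = joint / joint.sum()            # p(z | x; theta)

print(bound(Q_arbitrary), "<=", log_px)      # strict inequality in general
print(bound(Q_posterior), "==", log_px)      # equality when Q is the posterior
```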

Convergence of the EM algorithm (cont.)

The previous calculation motivates the E step $E_{z \mid x;\, \theta^{(t)}}[\log p(x, z; \theta)]$ in the EM algorithm. We will now show that $l(\theta^{(t+1)}) \geq l(\theta^{(t)})$.

With our choice of $Q_i^{(t)}(z^{(i)}) \propto p(x^{(i)}, z^{(i)}; \theta^{(t)})$ at step $t$, we have:
$$l(\theta^{(t)}) = \sum_i \log p(x^{(i)}; \theta^{(t)}) = \sum_i \log \sum_{z^{(i)}} p(x^{(i)}, z^{(i)}; \theta^{(t)}) = \sum_i \log \sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})} = \sum_i \sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})}.$$

Convergence of the EM algorithm (cont.)

Now,
$$l(\theta^{(t+1)}) = \sum_i \sum_{z^{(i)}} Q_i^{(t+1)}(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta^{(t+1)})}{Q_i^{(t+1)}(z^{(i)})}$$
$$\geq \sum_i \sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta^{(t+1)})}{Q_i^{(t)}(z^{(i)})}$$
$$\geq \sum_i \sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})}$$
$$= l(\theta^{(t)}).$$

The first inequality holds by Jensen's inequality (our choice of $Q_i$ gives equality in Jensen, but the inequality holds for any probability distribution). The second inequality holds by definition of $\theta^{(t+1)}$:
$$\theta^{(t+1)} := \operatorname{argmax}_{\theta} \sum_i \sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i^{(t)}(z^{(i)})}.$$
The final equality is the one established above with $Q_i^{(t)}$ at $\theta^{(t)}$.

Example - Univariate Gaussian

We consider a simple example to illustrate the EM algorithm. Suppose $W_i \sim N(\mu, \sigma^2)$ with $\mu \in \mathbb{R}$ and $\sigma > 0$. Suppose $w_i$ was observed for $i = 1, \dots, m$ and $w_i$ is missing for $i = m+1, \dots, n$. Let
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
be the Gaussian density.

The likelihood function for $\theta = (\mu, \sigma^2)$ is given by
$$L(\theta) = \prod_{i=1}^n f(w_i) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(w_i-\mu)^2}{2\sigma^2}} = \prod_{i=1}^m \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(w_i-\mu)^2}{2\sigma^2}} \prod_{i=m+1}^n \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(w_i-\mu)^2}{2\sigma^2}}.$$

Marginalizing over the unobserved values (each unobserved factor integrates to 1), we get:
$$\int \cdots \int L(\theta)\, dw_{m+1} \cdots dw_n = \prod_{i=1}^m \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(w_i-\mu)^2}{2\sigma^2}}.$$

Example - Univariate Gaussian (cont.)

Conclusion: The MLE for $(\mu, \sigma^2)$ is the usual MLE computed from the observed values:
$$\hat\mu = \frac{1}{m} \sum_{i=1}^m w_i, \qquad \hat\sigma^2 = \frac{1}{m} \sum_{i=1}^m w_i^2 - \hat\mu^2.$$
We will now re-derive the same result using the EM algorithm.

The complete-data log-likelihood function is:
$$l(\theta) = \sum_{i=1}^n \left[ -\frac{1}{2} \log \sigma^2 - \frac{1}{2\sigma^2}(w_i - \mu)^2 - \frac{1}{2} \log 2\pi \right] = -\frac{n}{2} \log \sigma^2 - \frac{n}{2} \log 2\pi - \frac{1}{2\sigma^2}\left[ n\mu^2 + \sum_{i=1}^n w_i^2 - 2\mu \sum_{i=1}^n w_i \right].$$
Remark: The log-likelihood is linear in $\sum_{i=1}^n w_i$ and $\sum_{i=1}^n w_i^2$.

Example - Univariate Gaussian (cont.)

The E step of the EM algorithm at step $t$ calculates:
$$E\Big(\sum_{i=1}^n w_i \,\Big|\, w_{\mathrm{obs}}; \theta^{(t)}\Big) = \sum_{i=1}^m w_i + (n-m)\mu^{(t)},$$
$$E\Big(\sum_{i=1}^n w_i^2 \,\Big|\, w_{\mathrm{obs}}; \theta^{(t)}\Big) = \sum_{i=1}^m w_i^2 + (n-m)\big[(\mu^{(t)})^2 + (\sigma^2)^{(t)}\big].$$
Note: Replacing $\sum_{i=1}^n w_i$ and $\sum_{i=1}^n w_i^2$ in $l(\theta)$ by the above expressions, the resulting function has the same functional form as the usual log-likelihood.

We conclude that
$$\mu^{(t+1)} = \frac{1}{n} \sum_{i=1}^m w_i + \frac{n-m}{n}\,\mu^{(t)},$$
$$(\sigma^2)^{(t+1)} = \frac{1}{n} \sum_{i=1}^m w_i^2 + \frac{n-m}{n}\big[(\mu^{(t)})^2 + (\sigma^2)^{(t)}\big] - (\mu^{(t+1)})^2.$$

Example - Univariate Gaussian (cont.)

Usually, one would iterate the following system until convergence:
$$\mu^{(t+1)} = \frac{1}{n} \sum_{i=1}^m w_i + \frac{n-m}{n}\,\mu^{(t)},$$
$$(\sigma^2)^{(t+1)} = \frac{1}{n} \sum_{i=1}^m w_i^2 + \frac{n-m}{n}\big[(\mu^{(t)})^2 + (\sigma^2)^{(t)}\big] - (\mu^{(t+1)})^2.$$

In this simple case, we can directly compute the limit by letting $t \to \infty$ and solving:
$$\hat\mu = \frac{1}{n} \sum_{i=1}^m w_i + \frac{n-m}{n}\,\hat\mu \iff \hat\mu = \frac{1}{m} \sum_{i=1}^m w_i,$$
and
$$\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^m w_i^2 + \frac{n-m}{n}\big(\hat\mu^2 + \hat\sigma^2\big) - \hat\mu^2 \iff \hat\sigma^2 = \frac{1}{m} \sum_{i=1}^m w_i^2 - \hat\mu^2.$$
We obtain the same result as in the direct approach.
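A minimal runnable sketch of this iteration (not the lecture's code; the data are simulated only for illustration). It checks that the EM limit agrees with the observed-data MLE derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 30                               # n values in total, only the first m observed
w = rng.normal(loc=2.0, scale=1.5, size=n)
w_obs = w[:m]                               # w_{m+1}, ..., w_n are treated as missing

s1, s2 = w_obs.sum(), (w_obs ** 2).sum()    # observed parts of the sufficient statistics
mu, sigma2 = 0.0, 1.0                       # initial guess theta^(0)

for t in range(500):
    # E step: conditional expectations of sum w_i and sum w_i^2 given w_obs, theta^(t)
    E_sum = s1 + (n - m) * mu
    E_sumsq = s2 + (n - m) * (mu ** 2 + sigma2)
    # M step: complete-data MLE with the expected sufficient statistics plugged in
    mu_new = E_sum / n
    sigma2_new = E_sumsq / n - mu_new ** 2
    if abs(mu_new - mu) + abs(sigma2_new - sigma2) < 1e-12:
        break
    mu, sigma2 = mu_new, sigma2_new

print(mu, sigma2)                                              # EM limit
print(w_obs.mean(), (w_obs ** 2).mean() - w_obs.mean() ** 2)   # observed-data MLE
```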

Example - summary

Of course, one would not use the EM algorithm in the univariate Gaussian case. The important points here are that
1. The E step was equivalent to computing the conditional expectation of the sufficient statistics.
2. The M step was equivalent to an MLE problem with complete data (often available in closed form).

The same phenomenon occurs when working with exponential family distributions:
$$f(y \mid \theta) = b(y) \exp\big(\theta^T T(y) - a(\theta)\big),$$
where $\theta$ is a vector of parameters, $T(y)$ is a vector of sufficient statistics, and $a(\theta)$ is a normalization constant (the log partition function).

This family includes the Gaussian, Bernoulli, binomial, multinomial, geometric, exponential, Poisson, Dirichlet, gamma, chi-square distributions, etc.
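To connect the Gaussian example to this general form, here is the standard computation (not from the slides) writing $N(\mu, \sigma^2)$ as an exponential family; the sufficient statistics $T(y) = (y, y^2)^T$ are exactly the quantities averaged in the E step above:
$$f(y \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(y-\mu)^2}{2\sigma^2}} = \underbrace{\frac{1}{\sqrt{2\pi}}}_{b(y)} \exp\Big( \underbrace{\tfrac{\mu}{\sigma^2}}_{\theta_1}\, y \;\; \underbrace{-\tfrac{1}{2\sigma^2}}_{\theta_2}\, y^2 \; - \underbrace{\big(\tfrac{\mu^2}{2\sigma^2} + \tfrac{1}{2}\log \sigma^2\big)}_{a(\theta)} \Big), \qquad T(y) = (y,\; y^2)^T.$$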

Example - fitting mixture models

The EM algorithm is also useful for fitting models where there is no missing data, but where some hidden variables make the estimation difficult.

A mixture model is a probability model with density
$$f(x) = \sum_{i=1}^K p_i f_i(x),$$
where $p_i \geq 0$, $\sum_{i=1}^K p_i = 1$, and each $f_i$ is a pdf.

To sample from such a model:
1. Choose a category $C$ at random according to the distribution $\{p_i\}_{i=1}^K$.
2. Choose $X \mid C = j \sim f_j$.

The $f_i$ are often taken from the same parametric family (e.g., Gaussian), but they don't have to be.
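A minimal sketch of this two-step sampling procedure for a one-dimensional Gaussian mixture (the weights and component parameters below are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
p = np.array([0.2, 0.5, 0.3])        # mixing probabilities, summing to 1
mus = np.array([-3.0, 0.0, 4.0])     # component means (arbitrary)
sigmas = np.array([1.0, 0.5, 2.0])   # component standard deviations (arbitrary)

n = 1000
c = rng.choice(len(p), size=n, p=p)          # step 1: draw the category C ~ {p_i}
x = rng.normal(loc=mus[c], scale=sigmas[c])  # step 2: draw X | C = j from f_j = N(mu_j, sigma_j^2)
print(x[:5])
print(np.bincount(c) / n)                    # empirical mixing proportions, close to p
```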

Example - mixture of Gaussians

Consider a mixture of $p$-dimensional Gaussian distributions with parameters $(\mu_i, \Sigma_i)_{i=1}^K$ and mixing probabilities $(p_i)_{i=1}^K \subset [0,1]$, $\sum_{i=1}^K p_i = 1$.

Consider a sample $(x_i)_{i=1}^n \subset \mathbb{R}^p$ from this model. The category from which each sample was obtained is unobserved. The parameters of the model are $\theta := \{\mu_i, \Sigma_i, p_i : i = 1, \dots, K\}$. The density for that model is
$$f(x) = \sum_{i=1}^K p_i\, \varphi(x; \mu_i, \Sigma_i),$$
where $\varphi(x; \mu, \Sigma)$ denotes the Gaussian density with parameters $(\mu, \Sigma)$.

The log-likelihood function is
$$l(\theta) = \sum_{i=1}^n \log \sum_{j=1}^K p_j\, \varphi(x_i; \mu_j, \Sigma_j).$$
Numerically optimizing $l(\theta)$ directly is known to be slow and unstable.

Example - mixture of Gaussians (cont.)

The EM algorithm approach is simpler and faster. Suppose our observations are $(x_i, c_i)$, where $c_i$ is the (unobserved) category from which $x_i$ was drawn. The log-likelihood function can then be written as
$$l(\theta) = \sum_{i=1}^n \log \sum_{j=1}^K 1_{\{c_i = j\}}\, p_j\, \varphi(x_i; \mu_j, \Sigma_j).$$

Using Bayes' rule:
$$\pi_{ij} := P(C_i = j \mid X_i = x_i) = \frac{P(X_i = x_i \mid C_i = j)\, P(C_i = j)}{\sum_{k=1}^K P(X_i = x_i \mid C_i = k)\, P(C_i = k)} = \frac{p_j\, \varphi(x_i; \mu_j, \Sigma_j)}{\sum_{k=1}^K p_k\, \varphi(x_i; \mu_k, \Sigma_k)}.$$

Example - mixture of Gaussians (cont.)

The EM algorithm for a mixture of Gaussians:

E step: Compute the membership probabilities (or "responsibilities") using the current estimate of the parameters:
$$\pi_{ij}^{(t)} = \frac{p_j^{(t)}\, \varphi(x_i; \mu_j^{(t)}, \Sigma_j^{(t)})}{\sum_{k=1}^K p_k^{(t)}\, \varphi(x_i; \mu_k^{(t)}, \Sigma_k^{(t)})}.$$

M step: Update the parameters:
$$\mu_j^{(t+1)} = \frac{1}{N_j} \sum_{i=1}^n \pi_{ij}^{(t)} x_i, \qquad \Sigma_j^{(t+1)} = \frac{1}{N_j} \sum_{i=1}^n \pi_{ij}^{(t)} (x_i - \mu_j^{(t+1)})(x_i - \mu_j^{(t+1)})^T, \qquad p_j^{(t+1)} = \frac{N_j}{n},$$
where $N_j = \sum_{i=1}^n \pi_{ij}^{(t)}$.
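A compact runnable sketch of these E and M steps for a one-dimensional mixture of two Gaussians (a minimal illustration, not the lecture's code; for $p$-dimensional data the covariance update uses the outer products shown above). The synthetic data are for demonstration only, and the final assertion checks the monotonicity guaranteed by the theorem above.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
# Synthetic 1-D data drawn from two Gaussians (for demonstration only).
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 700)])
n, K = x.size, 2

# Initial guesses theta^(0).
p = np.full(K, 1.0 / K)
mu = np.array([-1.0, 1.0])
sigma2 = np.array([1.0, 1.0])

loglik = []
for t in range(100):
    # E step: responsibilities pi_ij = p_j phi(x_i; mu_j, sigma_j^2) / sum_k ...
    dens = p * norm.pdf(x[:, None], loc=mu, scale=np.sqrt(sigma2))  # shape (n, K)
    loglik.append(np.log(dens.sum(axis=1)).sum())                   # l(theta^(t))
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M step: weighted complete-data MLE updates.
    N = resp.sum(axis=0)                          # N_j = sum_i pi_ij
    mu = (resp * x[:, None]).sum(axis=0) / N
    sigma2 = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / N
    p = N / n

print(p, mu, np.sqrt(sigma2))                     # estimated weights, means, std devs
assert all(b >= a - 1e-9 for a, b in zip(loglik, loglik[1:]))  # log-likelihood never decreases
```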
