Mean Field / Variational Approximations


Mean Field / Variational Approximations
Presented by Jose Nuñez, 0/24/05

Outline: Introduction; Mean Field Approximation; Structured Mean Field; Weighted Mean Field; Variational Methods

Introduction

Problem: we have a distribution P, but inference in P is hard to compute.
Previous solutions: approximate the energy functional (Bethe, Kikuchi).
New idea: directly optimize the energy functional by introducing a distribution Q, defined on the same domain of variables as P, which incorporates some constraints.
Objective: we want to find the Q which is the best approximation of P and use Q to make inferences; that is, find the Q that minimizes F[P, Q].

Mean Field Approximation

Assumptions: Q is our mean field approximation, and the variables in the distribution Q are independent. In the standard mean field approach, Q is completely factorized:

    Q(x) = ∏_i Q(x_i)

What happens when we apply mean field?

Mean Field Approximation

With Q factorized, the free energy decomposes into local expectation and entropy terms:

    F[P, Q] = −E_Q[ln P(X)] − H_Q(X),    H_Q(X) = ∑_i H_Q(X_i)

since the entropy of a fully factorized Q is the sum of the entropies of its marginals.

Task: find the marginals Q(x_i) minimizing F[P, Q] such that ∑_{x_i} Q(x_i) = 1 for each i.
Solving: build a Lagrangian, differentiate, and set to 0!

Mean Field Approximation

The distribution Q(x_i) is a locally optimal solution given Q(x_1), …, Q(x_{i−1}), Q(x_{i+1}), …, Q(x_n) if:

    Q(x_i) = (1/Z_i) exp{ E_Q[ ln P(X) | x_i ] }    (MF equation)

where Z_i is a local normalizing constant and E_Q[ln P(X) | x_i] is the conditional expectation given the value x_i.

Mean Field Approximation

Locality: only local operations are needed for iteration of the MF equations; in other words, only neighboring variables are needed.

    Q(x_i) = (1/Z_i) exp{ ∑_{φ : X_i ∈ Scope[φ]} E_Q[ ln φ(U_φ) | x_i ] }    (MF equation, simplified)

where U_φ = Scope[φ]. The calculation of Q(x_i) depends only on the clusters X_i belongs to.
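The simplified MF equation can be computed directly for discrete variables. Below is a minimal sketch (not from the slides) of one MF update for a single pairwise factor φ(x_i, x_j); the function name and setup are illustrative assumptions.

```python
import numpy as np

def mf_update(log_phi, q_j):
    """One mean field update for X_i in a single pairwise factor phi(x_i, x_j):
    Q(x_i) = (1/Z_i) exp{ E_{Q(x_j)}[ ln phi(x_i, x_j) ] }.

    log_phi: matrix with log_phi[xi, xj] = ln phi(xi, xj)
    q_j:     current marginal Q(x_j)
    """
    expected_log = log_phi @ q_j                      # E_Q[ln phi | x_i] for each x_i
    q_i = np.exp(expected_log - expected_log.max())   # subtract max for stability
    return q_i / q_i.sum()                            # local normalization Z_i
```

For example, with φ = [[2, 1], [1, 2]] and Q(x_j) concentrated on its first value, the update gives Q(x_i) ∝ [2, 1].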

Mean Field Approximation

Solution: iterate the mean field equations until they converge to a fixed point.
Problem: convergence to a local optimum.

Mean Field Approximation

Haft et al. paper: optimize the KL divergence D(Q ‖ P) instead of the free energy, decomposing it with respect to a single marginal Q(x_i). Assume Q(x) = ∏_i Q(x_i).
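Iterating the MF equations to a fixed point can be illustrated on a small Ising-style model. This is a sketch under assumed conventions (x_i ∈ {−1, +1}, couplings J, fields h), not from the slides; for this model the local update Q(x_i) ∝ exp(x_i · (h_i + Σ_j J_ij m_j)) reduces to the familiar tanh fixed-point equation for the means.

```python
import numpy as np

def mean_field_ising(J, h, iters=200):
    """Mean field for an Ising model p(x) ∝ exp(Σ_{i<j} J_ij x_i x_j + Σ_i h_i x_i),
    with x_i ∈ {−1, +1}. J must be symmetric with zero diagonal.
    Returns the converged mean-field means m_i = E_Q[x_i]."""
    m = np.zeros(len(h))                    # current means E_Q[x_i]
    for _ in range(iters):
        for i in range(len(h)):             # sequential (coordinate) updates
            m[i] = np.tanh(h[i] + J[i] @ m)
    return m
```

Two positively coupled spins under a positive field settle at a common positive fixed point, as expected; with a less favorable start the same iteration can land in a different local optimum, which is exactly the convergence caveat above.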

Mean Field Approximation

Haft et al. paper (continued): the decomposition splits into a part that does not depend on Q(x_i) and a part that does, subject to ∑_{x_i} Q(x_i) = 1. Optimizing the dependent part yields:

    Q(x_i) ∝ exp{ E_Q[ ln P(x_i | X_{−i}) ] }    (MF equation)

Locality:

    Q(x_i) ∝ exp{ E_Q[ ln P(x_i | M_i) ] }    (MF equation, simplified)

where M_i is the Markov boundary of X_i.

Mean Field Approximation

Algorithm: converges to one of typically many local minima. Easy to compute, but sometimes the approximation is not good enough: a fully factorized Q cannot describe complex posteriors (e.g., XOR). We must use a richer class of distributions.

Structured Mean Field: Exploiting Substructures

If we use a distribution Q that can capture some of the dependencies in P, we can get a better approximation. [Figure: a 4×4 grid of variables A_{1,1}, …, A_{4,4} for distribution P, and two possible substructures for Q: distribution Q_1 with independent chains, and distribution Q_2.]

Structured Mean Field: Exploiting Substructures

    Q(x) = (1/Z) ∏_j ψ_j(C_j)

where ψ_j is a factor with Scope[ψ_j] = C_j, and we assume we have the set of potential scopes {C_j ⊆ χ : j = 1, …, J}.

Structured Mean Field: Exploiting Substructures

Given Q(x) = (1/Z) ∏_j ψ_j(C_j) and the normalization restriction on each ψ_j(c_j), the potential ψ_j is locally optimal when:

    ψ_j(c_j) = exp{ ∑_{φ ∈ F} E_Q[ ln φ | c_j ] − ∑_{k ≠ j} E_Q[ ln ψ_k | c_j ] }

Structured Mean Field: Exploiting Substructures

Locality, as in standard mean field:

    ψ_j(c_j) = exp{ ∑_{φ ∈ A_j} E_Q[ ln φ | c_j ] − ∑_{ψ_k ∈ B_j} E_Q[ ln ψ_k | c_j ] }

where A_j = { φ ∈ F : U_φ is not independent of C_j } and B_j = { ψ_k : C_k is not independent of C_j }.

Structured Mean Field

Updating: the calculation of ψ_j depends on the clusters X_i belongs to, on the clusters of P overlapping C_j, and also on the scopes C_k dependent on C_j. [Figure: 4×4 grids for distribution P and distribution Q.]

Structured Mean Field

In other words, suppose we want to compute Q(A_{1,1}), with C_1 = {A_{1,1}, A_{1,2}}, C_2 = {A_{1,2}, A_{1,3}}, C_3 = {A_{2,1}, A_{2,2}}, ….
A_j: the clusters X_i belongs to, as in standard mean field, i.e., {A_{1,1}, A_{1,2}} and {A_{1,1}, A_{2,1}}, plus the clusters from F overlapping C_1. For example, A_{1,2} in C_1 overlaps in F, so we also need to consider {A_{1,2}, A_{1,3}} and {A_{1,2}, A_{2,2}}; the same occurs with A_{1,3} and A_{1,4}.
B_j: the clusters in Q dependent on C_1. In this example every C_j is independent of the others, therefore B_j is empty. [Figure: 4×4 grids for distribution P and distribution Q.]

Structured Mean Field

Again we want to compute Q(A_{1,1}), but assume a new substructure in Q: now we choose C_1 = {A_{1,1}, A_{1,2}, A_{1,3}, A_{1,4}} and C_2 = {A_{2,1}, A_{2,2}, A_{2,3}, A_{2,4}}.
A_j: we consider the same clusters as before, but now we add those overlapping with C_1, i.e., the vertical clusters such as {A_{1,2}, A_{2,2}}.
B_j: the clusters in Q dependent on C_1. Now we have variables in C_1 overlapping with variables in C_2; we need to subtract those terms, since we already used them in A_j. [Figure: 4×4 grids for distribution P and distribution Q, with C_1 and C_2 marked.]

Structured Mean Field

Another example: we want to compute Q(a, b). Now we choose C_1 = {A, B} and C_2 = {C, D}.
A_j = { {A, B}, {A, D}, {B, C} }.
B_j: empty, since C_1 and C_2 do not overlap. [Figure: a four-variable loop A–B–C–D for distribution P, and distribution Q split into {A, B} and {C, D}.]

Structured Mean Field: Exploiting Substructures

Updates are relatively costly due to the consideration of structure. Two approaches for updates:
Sequential: choose a factor and update it, then perform inference. It will converge.
Parallel: update all factors, then perform inference. It doesn't guarantee convergence.

Structured Mean Field

Example: structure (b) can be exploited:

    P(A, B, C, D) = (1/Z) ψ_1(A, B) ψ_2(C, D)
    Q(A, B, C, D) = Q_{AB}(A, B) Q_{CD}(C, D)

Structure (c) cannot be exploited (it is redundant):

    Q(A, B, C, D) = (1/Z') ψ'_1(A) ψ''_1(B) ψ'_2(C) ψ''_2(D)

Structured Mean Field: Refinement

Theorem: refine an initial approximating network by factorizing its factors into a product of factors from Q and potentials from F. Each ψ_k can be written as the product of two sets of factors: those in F whose scopes are subsets of the scope of ψ_k, and the factors in F only partially covered by the scope of ψ_k together with the other factors in Q.

Weighted Mean Field: General Mixture Weights

Idea: instead of selecting one particular MF solution, we form a weighted average (a mixture) of several solutions. Enumerate the different MF solutions Q(x | a) by a hidden variable a, and assign mixture weights Q(a):

    Q(x) = ∑_a Q(a) Q(x | a)

Weighted Mean Field

Given the solutions Q(x | a), under the constraint ∑_a Q(a) = 1, determine Q(a) such that D(Q ‖ P) is minimized:

    Q(a) = exp{ −D[Q(x | a) ‖ P] } / ∑_{a'} exp{ −D[Q(x | a') ‖ P] }

Weighted Mean Field: General Mixture Weights

The previous formula means that the different solutions Q(x | a) contribute to the global distribution Q according to their distance to P. Note, however, that we are not optimizing Q(a) and the Q(x | a) simultaneously.
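The mixture-weight formula can be checked directly for discrete distributions. A small sketch (the names `kl` and `mixture_weights` are mine, not from the slides):

```python
import numpy as np

def kl(q, p):
    """D(q || p) for discrete distributions given as arrays summing to 1."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def mixture_weights(solutions, p):
    """Weight each MF solution q_a by exp(-D[q_a || p]), then normalize,
    so solutions closer to p contribute more to the mixture."""
    w = np.array([np.exp(-kl(q, p)) for q in solutions])
    return w / w.sum()
```

A solution equal to p has KL divergence 0 and therefore receives the largest weight, matching the intuition that solutions contribute according to their distance to P.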

Weighted Mean Field

Example: noisy-OR.

Variational Methods

Idea: introduce auxiliary variational parameters that help in simplifying a complex objective function. For any x > 0 and λ > 0:

    ln x ≤ λx − ln λ − 1

This upper bound allows us to approximate ln x by a term that is linear in x.
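The bound is easy to verify numerically; a quick sketch (the function name is illustrative):

```python
import numpy as np

def log_upper_bound(x, lam):
    """Variational upper bound on ln x: for x, lam > 0, ln x <= lam*x - ln(lam) - 1.
    The bound is tight at lam = 1/x, where the right-hand side equals exactly ln x."""
    return lam * x - np.log(lam) - 1.0
```

Minimizing the right-hand side over λ recovers ln x exactly, which is why optimizing the variational parameter makes the linear approximation as tight as possible.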

Thank you!

Mean Field Approximation example from Wiegerinck; noisy-OR example from Haft et al.