Lecture 14: Stochastic Variational Inference



1 Lecture 14: Stochastic Variational Inference. Scribes: Dat Jabien Ting

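2 Recap: Variational Inference
Goal: approximate the posterior.
Generative model: $p(x, z, \beta) = p(x \mid z, \beta)\, p(z)\, p(\beta)$
Variational distribution: $q(z; \phi)\, q(\beta; \lambda) \approx p(z, \beta \mid x)$
Objective: Evidence Lower Bound (ELBO)
$\mathcal{L}(\phi, \lambda) = \mathbb{E}_{q(z;\phi)\, q(\beta;\lambda)}\left[\log \frac{p(x, z, \beta)}{q(z;\phi)\, q(\beta;\lambda)}\right] = \log p(x) - \mathrm{KL}\big(q(z;\phi)\, q(\beta;\lambda) \,\|\, p(z, \beta \mid x)\big)$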
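The identity ELBO = log p(x) - KL(q || posterior) can be checked numerically on a tiny model. The sketch below is only an illustration: it assumes a single discrete latent variable standing in for the pair (z, beta), and the joint values and the function name `elbo_and_gap` are hypothetical, not from the lecture.

```python
import numpy as np

def elbo_and_gap(joint, q):
    """joint[k] = p(x, z=k) for one fixed observation x; q[k] = q(z=k)."""
    joint = np.asarray(joint, dtype=float)
    q = np.asarray(q, dtype=float)
    log_evidence = np.log(joint.sum())          # log p(x)
    posterior = joint / joint.sum()             # p(z | x)
    elbo = np.sum(q * (np.log(joint) - np.log(q)))
    kl = np.sum(q * (np.log(q) - np.log(posterior)))
    return elbo, log_evidence, kl

# Hypothetical numbers chosen only for illustration.
joint = np.array([0.10, 0.05, 0.25])
q = np.array([0.3, 0.2, 0.5])
elbo, log_evidence, kl = elbo_and_gap(joint, q)
print(elbo, log_evidence - kl)  # the two values agree up to rounding
```

Because the KL term is non-negative, the ELBO never exceeds log p(x), and maximizing it over q is the same as minimizing the KL to the posterior.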

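3 Latent Dirichlet Allocation: Generative Model
$\beta_k \sim \mathrm{Dirichlet}(\omega)$, $k = 1, \dots, K$
$\theta_d \sim \mathrm{Dirichlet}(\alpha)$, $d = 1, \dots, D$
$z_{dn} \sim \mathrm{Discrete}(\theta_d)$, $n = 1, \dots, N_d$
$x_{dn} \mid z_{dn} = k \sim \mathrm{Discrete}(\beta_k)$
$p(x, z, \beta) = \Big(\prod_d p(x_d \mid z_d, \beta)\, p(z_d)\Big) \Big(\prod_k p(\beta_k)\Big)$, where $p(z_d) = \int d\theta_d\, p(z_d \mid \theta_d)\, p(\theta_d)$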
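The LDA generative process (topics beta_k from a Dirichlet, per-document proportions theta_d from a Dirichlet, topic assignments z_dn from Discrete(theta_d), words x_dn from Discrete(beta_k) for the assigned topic) can be sketched as a sampler. This is a minimal sketch assuming symmetric Dirichlet priors and a fixed document length; the function name `sample_lda` and all default values are hypothetical.

```python
import numpy as np

def sample_lda(num_docs, doc_len, num_topics, vocab_size,
               alpha=0.5, omega=0.5, seed=0):
    """Draw (beta, theta, z, x) from the LDA generative model.

    Assumes symmetric priors Dirichlet(omega) on topics and
    Dirichlet(alpha) on document proportions (illustrative choice).
    """
    rng = np.random.default_rng(seed)
    beta = rng.dirichlet(np.full(vocab_size, omega), size=num_topics)   # (K, V)
    theta = rng.dirichlet(np.full(num_topics, alpha), size=num_docs)    # (D, K)
    z = np.empty((num_docs, doc_len), dtype=int)
    x = np.empty((num_docs, doc_len), dtype=int)
    for d in range(num_docs):
        # Topic assignment for every word position in document d.
        z[d] = rng.choice(num_topics, size=doc_len, p=theta[d])
        for n in range(doc_len):
            # Word drawn from the topic selected by z[d, n].
            x[d, n] = rng.choice(vocab_size, p=beta[z[d, n]])
    return beta, theta, z, x

beta, theta, z, x = sample_lda(num_docs=5, doc_len=8, num_topics=3, vocab_size=10)
```

Here theta is sampled explicitly; integrating it out gives the marginal p(z_d) that appears in the factorized joint.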

4 LDA: Global vs Local Parameters
Generative model: $p(x, z, \beta) = p(x \mid z, \beta)\, p(z)\, p(\beta) = \Big(\prod_d p(x_d \mid z_d, \beta)\, p(z_d)\Big) \Big(\prod_k p(\beta_k)\Big)$
Variational approximation: $q(z; \phi)\, q(\beta; \lambda) = \Big(\prod_d q(z_d; \phi_d)\Big) \Big(\prod_k q(\beta_k; \lambda_k)\Big)$
$\phi_d$: local variational parameters (only depend on doc $d$)
$\lambda_k$: global variational parameters

5 Stochastic Variational Inference
Problem: Wikipedia has 5M+ entries. If we wanted to do VBEM, then we would need the updates
$\phi_d = \operatorname{argmax}_{\phi_d} \mathcal{L}(\phi, \lambda)$, $d = 1, \dots, D$
$\lambda = \operatorname{argmax}_{\lambda} \mathcal{L}(\phi, \lambda)$
Each of these updates requires a full pass over the 5M+ documents.
Solution: optimize $\lambda$ with stochastic gradient descent.

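6 Stochastic Gradient Ascent
Idea: optimize the objective with a noisy estimate of the gradient.
$\lambda^{t+1} = \lambda^t + \rho_t\, \hat{g}(\lambda^t)$, where $\rho_t$ is the step size and $\hat{g}$ is an approximation of the gradient.
Intuition: random walk with drift in the direction of the gradient.
Requirement 1: the gradient estimate should be unbiased, $\mathbb{E}[\hat{g}(\lambda)] = \nabla_\lambda \mathcal{L}(\lambda)$.
Requirement 2: Robbins-Monro conditions, $\sum_{t=1}^{\infty} \rho_t = \infty$ (infinite mean displacement) and $\sum_{t=1}^{\infty} \rho_t^2 < \infty$ (finite variance).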
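A small simulation illustrates stochastic gradient ascent with an unbiased noisy gradient and a Robbins-Monro step-size schedule. The sketch assumes a hypothetical one-dimensional objective L(lam) = -(lam - 3)^2 / 2 and the schedule rho_t = (t + tau)^(-kappa) with kappa in (0.5, 1], a common choice that satisfies both conditions; neither the objective nor the schedule is taken from the slides.

```python
import numpy as np

def stochastic_gradient_ascent(noisy_grad, lam0, num_steps,
                               tau=1.0, kappa=0.7, seed=0):
    """Iterate lam <- lam + rho_t * g_hat(lam) with rho_t = (t + tau)^(-kappa)."""
    rng = np.random.default_rng(seed)
    lam = float(lam0)
    for t in range(num_steps):
        rho = (t + tau) ** (-kappa)   # sum(rho) diverges, sum(rho^2) converges
        lam = lam + rho * noisy_grad(lam, rng)
    return lam

def noisy_grad(lam, rng):
    # Exact gradient of the hypothetical objective is (3 - lam);
    # zero-mean Gaussian noise keeps the estimate unbiased.
    return (3.0 - lam) + rng.normal(scale=1.0)

lam_hat = stochastic_gradient_ascent(noisy_grad, lam0=0.0, num_steps=20000)
print(lam_hat)  # close to the maximizer at 3
```

With a constant step size the iterate would keep fluctuating around the optimum; the decaying schedule shrinks the noise while still allowing arbitrarily large total displacement.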

7 Intuition: Choosing the Step Size
Intuition: choose the step size inversely proportional to the Hessian, $H_{ij} = \frac{\partial^2 \mathcal{L}}{\partial x_i\, \partial x_j}$.

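8 Stochastic Variational Inference
Approximation: compute the ELBO for a batch of docs.
$\mathcal{L}(\phi, \lambda) = \mathbb{E}_{q(z;\phi)\, q(\beta;\lambda)}\left[\log \frac{p(x, z, \beta)}{q(z;\phi)\, q(\beta;\lambda)}\right]$
$= \sum_{d=1}^{D} \mathbb{E}_{q(z_d;\phi_d)\, q(\beta;\lambda)}\left[\log \frac{p(x_d, z_d \mid \beta)}{q(z_d;\phi_d)}\right] + \mathbb{E}_{q(\beta;\lambda)}\left[\log \frac{p(\beta)}{q(\beta;\lambda)}\right]$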

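9 Stochastic Variational Inference
Approximation: compute the ELBO for a batch of docs.
$\mathcal{L}(\phi, \lambda) = \sum_{d=1}^{D} \mathbb{E}_{q(z_d;\phi_d)\, q(\beta;\lambda)}\left[\log \frac{p(x_d, z_d \mid \beta)}{q(z_d;\phi_d)}\right] - \mathrm{KL}\big(q(\beta;\lambda) \,\|\, p(\beta)\big)$
Choose a batch of docs: $b_1, \dots, b_B \sim \mathrm{Unif}(\{1, \dots, D\})$
$\hat{\mathcal{L}}(\phi, \lambda) = \frac{D}{B} \sum_{i=1}^{B} \mathbb{E}_{q(z_{b_i};\phi_{b_i})\, q(\beta;\lambda)}\left[\log \frac{p(x_{b_i}, z_{b_i} \mid \beta)}{q(z_{b_i};\phi_{b_i})}\right] - \mathrm{KL}\big(q(\beta;\lambda) \,\|\, p(\beta)\big)$
$\mathbb{E}\big[\hat{\mathcal{L}}(\phi, \lambda)\big] = \mathcal{L}(\phi, \lambda)$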
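The minibatch estimate is unbiased because a uniformly sampled batch sum, rescaled by D/B, has the full sum over documents as its expectation. The sketch below checks this by simulation. The per-document terms are random stand-ins for the per-document expectations in the ELBO, and `global_term` is a stand-in for the negative KL term; all names and numbers are hypothetical.

```python
import numpy as np

def minibatch_estimate(per_doc_terms, global_term, batch_size, rng):
    """Rescaled batch sum (D / B) * sum_b term_b plus the global term."""
    D = len(per_doc_terms)
    batch = rng.integers(0, D, size=batch_size)   # uniform, with replacement
    return (D / batch_size) * per_doc_terms[batch].sum() + global_term

rng = np.random.default_rng(0)
per_doc_terms = rng.normal(loc=-5.0, scale=1.0, size=1000)  # stand-in local terms
global_term = -2.0                                           # stand-in for -KL

full_objective = per_doc_terms.sum() + global_term
estimates = np.array([
    minibatch_estimate(per_doc_terms, global_term, batch_size=50, rng=rng)
    for _ in range(4000)
])
print(full_objective, estimates.mean())  # the average estimate approaches the full value
```

Each individual estimate is noisy, but only B of the D local terms need to be computed per step, which is what makes a gradient step cheap on a large corpus.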

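10 Stochastic Variational Inference
TL;DR: for conditionally conjugate exponential family models we can compute a natural gradient
$\hat{g}(\lambda) = \mathbb{E}_{q(z;\phi)}\big[\eta_g(x, z)\big] - \lambda$,
where the natural parameter $\eta_g(x, z)$ is the natural parameter of the prior $p(\beta)$ plus the sufficient statistics $\sum_{d=1}^{D} t(x_d, z_d)$.
This yields gradient updates
$\lambda^{t+1} = \lambda^t + \rho_t\, \hat{g}(\lambda^t) = (1 - \rho_t)\, \lambda^t + \rho_t\, \mathbb{E}_{q(z;\phi)}\big[\eta_g(x, z)\big]$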
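The natural-gradient update takes the form of a weighted average, lam <- (1 - rho_t) * lam + rho_t * lam_hat, where lam_hat is the value the global parameter would take if the sampled data point were replicated D times. The sketch below applies this to a deliberately simple stand-in model, not to LDA: a Beta prior with Bernoulli observations, which has no local latent variables, so the local step is empty. The function name `svi_beta_bernoulli`, the data, and the step-size schedule are assumptions for illustration.

```python
import numpy as np

def svi_beta_bernoulli(x, a0=1.0, b0=1.0, num_steps=20000,
                       tau=1.0, kappa=0.7, seed=0):
    """Stochastic updates for q(beta) = Beta(lam[0], lam[1]).

    Toy conjugate model: beta ~ Beta(a0, b0), x_d ~ Bernoulli(beta).
    """
    rng = np.random.default_rng(seed)
    D = len(x)
    prior = np.array([a0, b0])
    lam = prior.copy()
    for t in range(num_steps):
        d = rng.integers(D)                       # sample one data point uniformly
        stats = np.array([x[d], 1.0 - x[d]])      # sufficient statistics t(x_d)
        lam_hat = prior + D * stats               # as if x_d were seen D times
        rho = (t + tau) ** (-kappa)               # Robbins-Monro step size
        lam = (1.0 - rho) * lam + rho * lam_hat   # natural-gradient step
    return lam

data_rng = np.random.default_rng(1)
x = (data_rng.random(1000) < 0.3).astype(float)   # hypothetical data
lam_svi = svi_beta_bernoulli(x)
lam_exact = np.array([1.0 + x.sum(), 1.0 + len(x) - x.sum()])
print(lam_svi / lam_svi.sum(), lam_exact / lam_exact.sum())  # posterior means
```

In this toy case the exact posterior is available in closed form, so the stochastic iterate can be compared against it directly; in LDA the same global update would be preceded by optimizing the local parameters of the sampled documents.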

11 Natural Gradients: Coordinate Transformations
Example: suppose we have two equivalent sets of coordinates $(x_1, x_2)$ and $(\tilde{x}_1, \tilde{x}_2)$, with $\tilde{L}(\tilde{x}_1, \tilde{x}_2) = L(x_1, x_2)$.

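12 Gradient Values Change with Coordinates
$\frac{\partial \tilde{L}}{\partial \tilde{x}_i} = \sum_j \frac{\partial x_j}{\partial \tilde{x}_i}\, \frac{\partial L}{\partial x_j}$, so the partial derivatives in the $\tilde{x}$ coordinates differ from those in the $x$ coordinates.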

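13 Coordinate Invariant Gradients
Idea: enforce the requirement that the rate of change must be invariant,
$\frac{d\tilde{L}}{dt} = \left(\frac{d\tilde{x}}{dt}\right)^{\top} \nabla_{\tilde{x}} \tilde{L}(\tilde{x}) = \left(\frac{dx}{dt}\right)^{\top} \nabla_{x} L(x) = \frac{dL}{dt}$
Jacobian matrix of partial derivatives: $J_{ij} = \frac{\partial x_i}{\partial \tilde{x}_j}$
Chain rule: $\nabla_{\tilde{x}} \tilde{L}(\tilde{x}) = J^{\top} \nabla_{x} L(x)$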

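14 Coordinate Invariant Gradients
Assumption: $x$ moves along the gradient, $\frac{dx}{dt} = \nabla_x L(x)$ and $\frac{d\tilde{x}}{dt} = \nabla_{\tilde{x}} \tilde{L}(\tilde{x})$.
$\frac{dL}{dt} = \nabla_x L(x)^{\top}\, \nabla_x L(x)$
$\frac{d\tilde{L}}{dt} = \nabla_{\tilde{x}} \tilde{L}(\tilde{x})^{\top}\, \nabla_{\tilde{x}} \tilde{L}(\tilde{x}) = \nabla_x L(x)^{\top}\, J J^{\top}\, \nabla_x L(x) \neq \frac{dL}{dt}$ in general.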

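15 Coordinate Invariant Gradients
Solution: define the natural gradient
$\hat{\nabla}_{\tilde{x}} \tilde{L}(\tilde{x}) = (J^{\top} J)^{-1}\, \nabla_{\tilde{x}} \tilde{L}(\tilde{x})$
With $\frac{d\tilde{x}}{dt} = \hat{\nabla}_{\tilde{x}} \tilde{L}(\tilde{x})$:
$\frac{d\tilde{L}}{dt} = \nabla_{\tilde{x}} \tilde{L}(\tilde{x})^{\top} (J^{\top} J)^{-1}\, \nabla_{\tilde{x}} \tilde{L}(\tilde{x}) = \nabla_x L(x)^{\top} J (J^{\top} J)^{-1} J^{\top}\, \nabla_x L(x) = \nabla_x L(x)^{\top}\, \nabla_x L(x) = \frac{dL}{dt}$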
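The invariance claim can be verified numerically for a linear change of coordinates x = J x_tilde. The sketch below assumes a hypothetical quadratic objective and a diagonal Jacobian; it compares the rate of change dL/dt under plain gradient flow in x, plain gradient flow in x_tilde, and natural gradient flow in x_tilde using the metric J^T J.

```python
import numpy as np

def grad_L(x):
    # Gradient of the hypothetical objective L(x) = -0.5 * ||x - c||^2.
    c = np.array([1.0, -2.0])
    return -(x - c)

# Hypothetical linear reparameterization x = J @ x_tilde.
J = np.array([[2.0, 0.0],
              [0.0, 0.5]])
x_tilde = np.array([0.3, 0.7])
x = J @ x_tilde

g = grad_L(x)                                     # gradient in x coordinates
g_tilde = J.T @ g                                 # chain rule: gradient in x_tilde
nat_g_tilde = np.linalg.solve(J.T @ J, g_tilde)   # (J^T J)^{-1} g_tilde

rate_x = g @ g                                    # dL/dt, gradient flow in x
rate_tilde_plain = g_tilde @ g_tilde              # dL/dt, plain gradient flow in x_tilde
rate_tilde_natural = g_tilde @ nat_g_tilde        # dL/dt, natural gradient flow in x_tilde
print(rate_x, rate_tilde_plain, rate_tilde_natural)
```

The plain gradient in the rescaled coordinates gives a different rate of change, while the natural gradient reproduces the rate obtained in the original coordinates.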

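16 Coordinate Invariant Gradients
Distance metrics:
$dx^{\top} dx = dx_1^2 + dx_2^2 = d\tilde{x}^{\top} J^{\top} J\, d\tilde{x} = d\tilde{x}^{\top} G\, d\tilde{x}$, with $G = J^{\top} J$.
For $J = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}$: $dx^{\top} dx = \sigma_1^2\, d\tilde{x}_1^2 + \sigma_2^2\, d\tilde{x}_2^2$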

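17 Natural Gradients in Variational Inference
Distance metric: symmetric KL divergence
$\mathrm{KL}^{\mathrm{sym}}(\lambda, \lambda') = \mathbb{E}_{q(\beta;\lambda)}\left[\log \frac{q(\beta;\lambda)}{q(\beta;\lambda')}\right] + \mathbb{E}_{q(\beta;\lambda')}\left[\log \frac{q(\beta;\lambda')}{q(\beta;\lambda)}\right] = \mathrm{KL}\big(q(\beta;\lambda) \,\|\, q(\beta;\lambda')\big) + \mathrm{KL}\big(q(\beta;\lambda') \,\|\, q(\beta;\lambda)\big)$
$d\lambda^{\top} G(\lambda)\, d\lambda = \mathrm{KL}^{\mathrm{sym}}(\lambda, \lambda + d\lambda)$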
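The relation between the symmetric KL divergence and the metric G can be checked numerically for a simple family. The sketch below assumes a Bernoulli distribution parameterized by its natural parameter eta (an illustrative choice, not the variational family from the lecture), for which the Fisher information is G(eta) = p (1 - p) with p = sigmoid(eta).

```python
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def kl_bernoulli(eta_a, eta_b):
    """KL(Bernoulli(sigmoid(eta_a)) || Bernoulli(sigmoid(eta_b)))."""
    p, q = sigmoid(eta_a), sigmoid(eta_b)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def sym_kl(eta_a, eta_b):
    return kl_bernoulli(eta_a, eta_b) + kl_bernoulli(eta_b, eta_a)

def fisher(eta):
    # Fisher information of a Bernoulli in its natural parameter.
    p = sigmoid(eta)
    return p * (1 - p)

eta, d_eta = 0.4, 1e-3
lhs = sym_kl(eta, eta + d_eta)          # symmetric KL for a small step
rhs = d_eta * fisher(eta) * d_eta       # quadratic form d_eta * G * d_eta
print(lhs, rhs)  # agree to leading order in d_eta
```

Each of the two KL terms contributes half of the quadratic form to leading order, which is why the symmetric version matches d_eta * G * d_eta without a factor of one half.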
