Graphical Models, ML 701
Anna Goldenberg

Outline
- Dynamic Models
- Gaussian Linear Models
- Kalman Filter
- Dynamic Bayes Nets
- Undirected Models
- Unification
- Summary

HMMs

An HMM, in short:
- is a Bayes Net over hidden states and observations
- satisfies the Markov property (past and future states are independent given the present)
- has discrete states (and discrete time steps)

The joint over hidden states Q and observations O factorizes as

    P(Q, O) = p(q_0) ∏_{t=1}^{T-1} p(q_{t+1} | q_t) ∏_{t=1}^{T} p(o_t | q_t)
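To make the factorization concrete, here is a minimal NumPy sketch of evaluating P(Q, O) for a discrete HMM. The transition table, emission table, and the state/observation sequences are made-up illustrative values, not from the lecture; for simplicity it pairs one observation with each state.

```python
import numpy as np

# Minimal sketch of the HMM joint probability P(Q, O) for a discrete HMM.
# All numbers below are made-up illustrative values (2 hidden states, 2 symbols).
p0 = np.array([0.6, 0.4])                  # p(q_0)
A = np.array([[0.7, 0.3],                  # A[i, j] = p(q_{t+1}=j | q_t=i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],                  # B[i, k] = p(o_t=k | q_t=i)
              [0.3, 0.7]])

def joint_prob(states, obs):
    """P(Q, O) = p(q_0) * prod_t p(q_{t+1}|q_t) * prod_t p(o_t|q_t)."""
    p = p0[states[0]]
    for t in range(len(states) - 1):
        p *= A[states[t], states[t + 1]]   # transition terms
    for t in range(len(states)):
        p *= B[states[t], obs[t]]          # emission terms
    return p

print(joint_prob(states=[0, 0, 1], obs=[0, 1, 1]))
```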
What about continuous HMMs?

Gaussian Linear State Space models!

Example of use: SLAM (Simultaneous Localization and Mapping)
http://www.stanford.edu/~paskin/slam/
Drawback: the belief state, and the time per update, grow quadratically in the number of landmarks.

State Space Models

- hidden states and observations, with the same factorization as the HMM:

    P(Q, O) = p(q_0) ∏_{t=1}^{T-1} p(q_{t+1} | q_t) ∏_{t=1}^{T} p(o_t | q_t)

- q_t is a real-valued K-dimensional hidden state variable
- o_t is a real-valued observation vector
State Space Models

- hidden states q_t, observations o_t:

    q_t = f(q_{t-1}) + w_t
    o_t = g(q_t) + v_t

- f determines the mean of q_t given the mean of q_{t-1}; w_t is a zero-mean random noise vector (similarly for g and v_t)

Gaussian Linear State Space Models

- q_t and o_t are Gaussian
- f and g are linear and time-invariant:

    q_t = A q_{t-1} + w_t,   w_t ~ N(0, R)
    o_t = C q_t + v_t,       v_t ~ N(0, S)
    q_0 ~ N(0, Σ_0)

  A - transition matrix, C - observation matrix
  (correction: in an earlier version of these slides R and S were reversed)

Inference

- forward step (filtering):   p(q_t | o_0, ..., o_t)
- backward step (smoothing):  p(q_t | o_0, ..., o_T)

Kalman Filter (1960)

Time update:

    q̂_{t|t-1} = A q̂_{t-1|t-1}
    V_{t|t-1} = A V_{t-1|t-1} A^T + R

Measurement update: go from P(q_t | o_0, ..., o_{t-1}) to P(q_t | o_0, ..., o_t).

1. Form the joint P(q_t, o_t | o_0, ..., o_{t-1}), which is Gaussian with mean

       [ q̂_{t|t-1} ]
       [ C q̂_{t|t-1} ]

   and covariance

       [ Σ_11  Σ_12 ]   [ V_{t|t-1}      V_{t|t-1} C^T       ]
       [ Σ_21  Σ_22 ] = [ C V_{t|t-1}    C V_{t|t-1} C^T + S ]

2. Condition on o_t to obtain P(q_t | o_0, ..., o_t):

       q̂_{t|t} = q̂_{t|t-1} + Σ_12 Σ_22^{-1} (o_t - C q̂_{t|t-1})
       V_{t|t} = V_{t|t-1} - Σ_12 Σ_22^{-1} Σ_21
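A minimal NumPy sketch of these two updates, run over a sequence of observations. The matrices, noise covariances, and the tiny 1-D example at the bottom are made-up assumptions for illustration, not values from the lecture.

```python
import numpy as np

def kalman_filter(obs, A, C, R, S, mu0, Sigma0):
    """Minimal Kalman filter: returns the filtered means/covariances of p(q_t | o_0..o_t).

    A, C: transition and observation matrices; R, S: process/observation noise
    covariances; mu0, Sigma0: prior on q_0.  All inputs are assumed known.
    """
    mu, V = mu0, Sigma0
    means, covs = [], []
    for o in obs:
        # time update: p(q_t | o_0..o_{t-1})
        mu_pred = A @ mu
        V_pred = A @ V @ A.T + R
        # measurement update: condition the joint Gaussian on o_t
        S12 = V_pred @ C.T                    # Sigma_12
        S22 = C @ V_pred @ C.T + S            # Sigma_22
        K = S12 @ np.linalg.inv(S22)          # Sigma_12 Sigma_22^{-1} (Kalman gain)
        mu = mu_pred + K @ (o - C @ mu_pred)
        V = V_pred - K @ S12.T                # V_pred - Sigma_12 Sigma_22^{-1} Sigma_21
        means.append(mu)
        covs.append(V)
    return means, covs

# Tiny 1-D example with made-up numbers: a random walk observed with noise.
A = np.array([[1.0]]); C = np.array([[1.0]])
R = np.array([[0.1]]); S = np.array([[0.5]])
obs = [np.array([o]) for o in (1.2, 0.9, 1.5, 1.1)]
means, _ = kalman_filter(obs, A, C, R, S, np.zeros(1), np.eye(1))
print([float(m[0]) for m in means])
```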
Kalman Filter Usage

- Tracking motion: missiles, hand motion, lip motion from videos
- Signal processing
- Navigation
- Economics (for prediction)

(Reported by Welch and Bishop, SIGGRAPH 2001)

Dynamic Bayes Nets

So far: HMMs and linear-Gaussian state space models. But are there more appealing models?

Example DBN (Koller and Friedman): variables Weather, Velocity, Location, Failure and Obs, replicated over time slices 0, 1, 2, with edges within and between consecutive slices.

- It's just a Bayes Net!
- Approach to the dynamics (a sketch of this predict/update loop follows below):
  1. Start with some prior for the initial state.
  2. Predict the next state using only the observations up to the previous time step.
  3. Incorporate the new observation and re-estimate the current state.
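To illustrate the predict/update loop for a discrete-state model (the same recursion the Kalman filter implements for the linear-Gaussian case), here is a minimal sketch. The transition and observation tables and the observation sequence are made-up values, not the Koller and Friedman example.

```python
import numpy as np

# Forward filtering for a discrete-state dynamic model:
# 1. start from a prior, 2. predict with the transition model,
# 3. weight by the likelihood of the new observation and renormalize.
# All numbers are made-up illustrative values (2 hidden states, 2 symbols).
prior = np.array([0.5, 0.5])              # p(q_0)
T = np.array([[0.9, 0.1],                 # T[i, j] = p(q_{t+1}=j | q_t=i)
              [0.2, 0.8]])
E = np.array([[0.8, 0.2],                 # E[i, k] = p(o_t=k | q_t=i)
              [0.3, 0.7]])

belief = prior
for o in [0, 0, 1]:                       # an arbitrary observation sequence
    predicted = T.T @ belief              # step 2: p(q_t | o_0..o_{t-1})
    belief = E[:, o] * predicted          # step 3: multiply by p(o_t | q_t)
    belief /= belief.sum()                #         and renormalize
    print(belief)
```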
Other graphical models

...but first, any questions so far?

Most importantly: use the structure of the Bayes Net. Use the independencies!

Are all graphical models directed?

There are Undirected Graphical Models!

Undirected models:

    p(x) = (1/Z) ∏_C ψ(x_C)

where ψ(x_C) is a non-negative potential function. What are the C?
Cliques

    p(x) = (1/Z) ∏_C ψ(x_C),   ψ(x_C) a non-negative potential function

A clique C is a subset of V such that for every i, j in C, (i, j) is an edge. A clique is maximal if it is not contained in any other clique.

(The slides then ask, for several node subsets of an example graph, which are cliques and which are maximal cliques; the graph itself is not reproduced here.)

Decomposition

Independence rule: V1 is independent of V2 given a cutset S. S is called the Markov Blanket (MB); in an undirected model, the Markov blanket of a node is its set of neighbors.

Note, to resolve the confusion: the most common machine learning notation is the decomposition over maximal cliques. For the five-variable example graph in the slides (one maximal clique of three nodes and three maximal cliques of two nodes):

    p(A, B, C, D, E) = (1/Z) ψ(·,·,·) ψ(·,·) ψ(·,·) ψ(·,·)
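A minimal sketch of the maximal-clique decomposition for a small binary undirected model. The graph (a three-node chain) and the potential tables are made-up for illustration, not the example graph from the slides.

```python
import itertools
import numpy as np

# Made-up example: binary variables x1 - x2 - x3 (a chain), so the maximal
# cliques are {x1, x2} and {x2, x3}, each with a non-negative potential table.
psi_12 = np.array([[2.0, 1.0],
                   [1.0, 3.0]])   # psi(x1, x2)
psi_23 = np.array([[1.0, 2.0],
                   [2.0, 1.0]])   # psi(x2, x3)

def unnormalized(x1, x2, x3):
    """Product of the maximal-clique potentials, before dividing by Z."""
    return psi_12[x1, x2] * psi_23[x2, x3]

# The partition function Z sums the product of potentials over all joint states.
Z = sum(unnormalized(*x) for x in itertools.product([0, 1], repeat=3))

# p(x) = (1/Z) * prod_C psi(x_C)
print(unnormalized(1, 1, 0) / Z)
```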
Are undirected models useful?

Yes!
- Used a lot in physics (Ising model, Boltzmann machine)
- In vision (every pixel is a node)
- Bioinformatics

Why are they not more popular? It's the Z... the partition function:

    p(x) = (1/Z) ∏_C ψ(x_C)

What is Z, and ways to fight it:

    Z = Σ_x ∏_C ψ(x_C)

- Approximations
- Sampling (MCMC sampling is common; see the sketch below)
- Pseudo-likelihood
- Mean-field approximation
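Sampling approaches exploit the fact that the conditional p(x_i | rest) only involves the potentials touching x_i, so Z cancels. A minimal Gibbs-sampling sketch for the same kind of pairwise binary model as above; the graph and potentials are made-up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up pairwise binary model on a chain x1 - x2 - x3 (as in the sketch above).
psi = {(0, 1): np.array([[2.0, 1.0], [1.0, 3.0]]),
       (1, 2): np.array([[1.0, 2.0], [2.0, 1.0]])}

def gibbs_sample(n_steps=5000):
    """MCMC (Gibbs) sampling: Z cancels in the conditional p(x_i | x_{-i})."""
    x = rng.integers(0, 2, size=3)
    samples = []
    for _ in range(n_steps):
        for i in range(3):
            # unnormalized probabilities of x_i = 0 and x_i = 1 given the rest
            w = np.ones(2)
            for (a, b), table in psi.items():
                if i == a:
                    w *= table[:, x[b]]
                elif i == b:
                    w *= table[x[a], :]
            x[i] = rng.choice(2, p=w / w.sum())
        samples.append(x.copy())
    return np.array(samples)

# Estimate marginals from samples instead of summing over all states for Z.
print(gibbs_sample().mean(axis=0))
```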
Chain Graphs

- Generalization of MRFs and Bayes Nets
- Structured as blocks
- Undirected edges within a block
- Directed edges between blocks
- Quite intractable and not very popular; used in Biomedical Engineering (text)

(The slide shows several example graphical models and asks of each: directed or undirected?)
Summary

- Graphical Models is a huge, evolving field
- There are many other variations that haven't been discussed
- Used extensively in a variety of domains
- Tractability issues; more work to be done

Questions?