CS 4495 Computer Vision Hidden Markov Models


CS 4495 Computer Vision. Aaron Bobick, School of Interactive Computing

Administrivia: PS4 going OK? Please share your experiences on Piazza, e.g. you discovered something that is subtle about using vl_sift. If you want to talk about what scales worked and why, that's OK too.

Outline: Time Series; Markov Models; 3 computational problems of HMMs; Applying HMMs in vision - Gesture. Slides borrowed from UMd and elsewhere. Material from slides by Sebastian Thrun and Yair Weiss.

Audio Spectrum: audio spectrum of the song of the Prothonotary Warbler.

Bird Sounds: Prothonotary Warbler, Chestnut-sided Warbler.

Questions One Could Ask: What bird is this? (time series classification) How will the song continue? (time series prediction) Is this bird sick? (outlier detection) What phases does this song have? (time series segmentation)

Other Sound Samples

Another Time Series Problem: stock prices of Cisco, General Electric, Intel, Microsoft.

Questions One Could Ask: Will the stock go up or down? (time series prediction) What type of stock is this, e.g. risky? (time series classification) Is the behavior abnormal? (outlier detection)

Music Analysis

Questions One Could Ask: Is this Beethoven or Bach? (time series classification) Can we compose more of that? (time series prediction/generation) Can we segment the piece into themes? (time series segmentation)

For vision: waving, pointing, controlling?

The Real Question: How do we model these problems? How do we formulate these questions as inference/learning problems?

Outline For Today: Time Series; Markov Models; 3 computational problems of HMMs; Applying HMMs in vision - Gesture; Summary.

Weather: A Markov Model (maybe?) [State transition diagram over Sunny, Rainy, Snowy; e.g. Sunny stays Sunny 80% of the time, Rainy stays Rainy 60%, Snowy stays Snowy 20%.] The probability of moving to a given state depends only on the current state: 1st-order Markovian.

Ingredients of a Markov Model
States: {S_1, S_2, ..., S_N}
State transition probabilities: a_ij = P(q_{t+1} = S_j | q_t = S_i)
Initial state distribution: π_i = P[q_1 = S_i]
[Same Sunny/Rainy/Snowy transition diagram as before.]

Ingredients of Our Markov Model
States: {S_sunny, S_rainy, S_snowy}
State transition probabilities:
A = | 0.8  0.15 0.05 |
    | 0.38 0.6  0.02 |
    | 0.75 0.05 0.2  |
Initial state distribution: π = (0.7, 0.25, 0.05)

Probability of a Time Series
Given A and π (below), what is the probability of the series Sunny, Rainy, Rainy, Rainy, Snowy, Snowy?
P(S_sunny) P(S_rainy | S_sunny) P(S_rainy | S_rainy) P(S_rainy | S_rainy) P(S_snowy | S_rainy) P(S_snowy | S_snowy)
= 0.7 × 0.15 × 0.6 × 0.6 × 0.02 × 0.2 ≈ 1.5 × 10^-4
A = | 0.8  0.15 0.05 |
    | 0.38 0.6  0.02 |
    | 0.75 0.05 0.2  |
π = (0.7, 0.25, 0.05)
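
As a concrete check of the arithmetic above, here is a minimal Python sketch (function and variable names such as markov_sequence_prob are mine, not from the slides) that multiplies the initial probability by the chain of transition probabilities for the Sunny, Rainy, Rainy, Rainy, Snowy, Snowy sequence.

```python
import numpy as np

# Transition matrix and initial distribution from the weather example
# (states ordered: sunny, rainy, snowy).
A = np.array([[0.80, 0.15, 0.05],
              [0.38, 0.60, 0.02],
              [0.75, 0.05, 0.20]])
pi = np.array([0.70, 0.25, 0.05])

def markov_sequence_prob(states, A, pi):
    """P(s_1,...,s_T) = pi[s_1] * prod_t A[s_{t-1}, s_t] for a 1st-order Markov chain."""
    p = pi[states[0]]
    for prev, cur in zip(states[:-1], states[1:]):
        p *= A[prev, cur]
    return p

SUNNY, RAINY, SNOWY = 0, 1, 2
seq = [SUNNY, RAINY, RAINY, RAINY, SNOWY, SNOWY]
print(markov_sequence_prob(seq, A, pi))  # about 1.5e-4
```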

Outline For Today: Time Series; Markov Models; 3 computational problems of HMMs; Applying HMMs in vision - Gesture; Summary.

[Diagram: the same Sunny/Rainy/Snowy weather chain with its transition probabilities, but now the states are NOT OBSERVABLE; only the outputs emitted from each hidden state, each with its own emission probabilities, are OBSERVABLE.]

Probability of a Time Series
Given the model below, what is the probability of this series of observations?
P(O) = P(O_coat, O_coat, O_umbrella, ..., O_umbrella)
     = Σ_{all Q} P(O | Q) P(Q) = Σ_{q_1,...,q_7} P(O | q_1, ..., q_7) P(q_1, ..., q_7)
     = (0.3 × 0.1 × ... × 0.6)(0.7 × 0.8 × ...) + ...
A = | 0.8  0.15 0.05 |
    | 0.38 0.6  0.02 |
    | 0.75 0.05 0.2  |
π = (0.7, 0.25, 0.05)
B = | 0.6  0.3  0.1  |
    | 0.05 0.3  0.65 |
    | 0    0.5  0.5  |

Specification of an HMM
N - number of states
Q = {q_1, q_2, ..., q_T} - sequence of states
Some form of output symbols:
  Discrete - a finite vocabulary of symbols of size M. One symbol is emitted each time a state is visited (or a transition is taken).
  Continuous - an output density in some feature space associated with each state, where an output is emitted with each visit.
For a given observation sequence O = {o_1, o_2, ..., o_T}, o_i is the observed symbol or feature at time i.

Specification of an HMM
A - the state transition probability matrix: a_ij = P(q_{t+1} = j | q_t = i)
B - the observation probability distribution:
  Discrete: b_j(k) = P(o_t = k | q_t = j), 1 ≤ k ≤ M
  Continuous: b_j(x) = p(o_t = x | q_t = j)
π - the initial state distribution: π(j) = P(q_1 = j)
A full HMM over a set of states and an output space is thus specified as a triple: λ = (A, B, π)
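
To make the triple λ = (A, B, π) concrete, here is a minimal sketch using the weather example; the observation symbols and the exact B values are my reading of the slide's example, so treat them as illustrative. The HMM dataclass name is mine.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HMM:
    A: np.ndarray   # N x N transition matrix, A[i, j] = P(q_{t+1}=j | q_t=i)
    B: np.ndarray   # N x M observation matrix, B[j, k] = P(o_t=k | q_t=j)
    pi: np.ndarray  # length-N initial state distribution

# Hidden states: sunny, rainy, snowy; observations: shorts, coat, umbrella (illustrative).
weather = HMM(
    A=np.array([[0.80, 0.15, 0.05],
                [0.38, 0.60, 0.02],
                [0.75, 0.05, 0.20]]),
    B=np.array([[0.60, 0.30, 0.10],
                [0.05, 0.30, 0.65],
                [0.00, 0.50, 0.50]]),
    pi=np.array([0.70, 0.25, 0.05]),
)
```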

What does this have to do with Vision? Given some sequence of observations, what model generated it? Using the previous example: given some observation sequence of clothing, is this Philadelphia, Boston or Newark? Notice that if it were Boston vs. Arizona you would not need the sequence!

Outline For Today: Time Series; Markov Models; 3 computational problems of HMMs; Applying HMMs in vision - Gesture; Summary.

The 3 great problems in HMM modelling:
1. Evaluation: Given the model λ = (A, B, π), what is the probability of occurrence of a particular observation sequence O = {o_1, ..., o_T}, i.e. P(O | λ)? This is the heart of the classification/recognition problem: I have a trained model for each of a set of classes; which one would most likely generate what I saw?
2. Decoding: Find the optimal state sequence to produce an observation sequence O = {o_1, ..., o_T}. Useful in recognition problems - it helps give meaning to states - which is not exactly legal but often done anyway.
3. Learning: Determine the model λ, given a training set of observations. Find λ such that P(O | λ) is maximal.

Problem 1: Naïve solution
State sequence Q = (q_1, ..., q_T)
Assume independent observations:
P(O | q, λ) = Π_{i=1}^{T} P(o_i | q_i, λ) = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T)
NB: Observations are mutually independent, given the hidden states. That is, if I know the states then the previous observations don't help me predict a new observation. The states encode *all* the information. Usually only kind-of true - see CRFs.

Problem 1: Naïve solution
But we know the probability of any given sequence of states:
P(q | λ) = π_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}

Problem 1: Naïve solution
Given P(O | q, λ) = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T) and P(q | λ) = π_{q_1} a_{q_1 q_2} ... a_{q_{T-1} q_T}, we get:
P(O | λ) = Σ_q P(O | q, λ) P(q | λ)
NB: The above sum is over all state paths. There are N^T state paths, each costing O(T) calculations, leading to O(T N^T) time complexity.
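
A minimal sketch of this naïve evaluation (function name is mine), enumerating all N^T state paths. It is only practical for toy problems, which is exactly the slide's point about the O(T N^T) cost, but it is a useful correctness check for the efficient algorithms below.

```python
import itertools
import numpy as np

def naive_evaluate(obs, A, B, pi):
    """P(O | lambda) by brute force: sum P(O | q) P(q | lambda) over all N**T state paths."""
    N, T = A.shape[0], len(obs)
    total = 0.0
    for q in itertools.product(range(N), repeat=T):
        p_q = pi[q[0]] * np.prod([A[q[t - 1], q[t]] for t in range(1, T)])
        p_o_given_q = np.prod([B[q[t], obs[t]] for t in range(T)])
        total += p_q * p_o_given_q
    return total
```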

Problem 1: Efficient solution
Define the auxiliary forward variable α:
α_t(i) = P(o_1, ..., o_t, q_t = i | λ)
α_t(i) is the probability of observing the partial sequence of observables o_1, ..., o_t AND being in state q_t = i at time t.

Problem 1: Efficient solution
Recursive algorithm:
Initialise: α_1(i) = π_i b_i(o_1)
Calculate: α_{t+1}(j) = [Σ_{i=1}^{N} α_t(i) a_ij] b_j(o_{t+1})
  (partial obs seq up to t AND state i at t) × (transition to j at t+1) × (sensor); the sum is there because we can reach j from any preceding state.
Obtain: P(O | λ) = Σ_{i=1}^{N} α_T(i), the sum over the different ways of getting the obs seq.
Complexity is only O(N²T)!

The Forward Algorithm (trellis diagram: states S_1, S_2, S_3 at each time step, observations O_1, O_2, ..., O_T)
α_t(i) = P(O_1, ..., O_t, q_t = S_i)
α_{t+1}(j) = P(O_1, ..., O_{t+1}, q_{t+1} = S_j)
           = Σ_{i=1}^{N} P(O_1, ..., O_t, q_t = S_i, O_{t+1}, q_{t+1} = S_j)
           = Σ_{i=1}^{N} P(O_1, ..., O_t, q_t = S_i) P(q_{t+1} = S_j | q_t = S_i) b_j(O_{t+1})
           = [Σ_{i=1}^{N} α_t(i) a_ij] b_j(O_{t+1})
α_1(i) = π_i b_i(O_1)
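
The forward recursion above translates almost line for line into code. This is a minimal sketch (no log/scaling tricks, which the learning slides later note you would want in practice); on short sequences it should agree with naive_evaluate from earlier.

```python
import numpy as np

def forward(obs, A, B, pi):
    """Forward algorithm: returns alpha (T x N) and P(O | lambda) in O(N^2 T)."""
    N, T = A.shape[0], len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # alpha_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]  # [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
    return alpha, alpha[-1].sum()                     # P(O | lambda) = sum_i alpha_T(i)
```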

Problem 1: Alternative solution
Backward algorithm: define the auxiliary backward variable β:
β_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = i, λ)
β_t(i) is the probability of observing the sequence of observables o_{t+1}, ..., o_T GIVEN state q_t = i at time t, and λ.

Problem 1: Alternative solution
Recursive algorithm:
Initialize: β_T(j) = 1
Calculate: β_t(i) = Σ_{j=1}^{N} β_{t+1}(j) a_ij b_j(o_{t+1}), for t = T-1, ..., 1
Terminate: p(O | λ) = Σ_{i=1}^{N} β_1(i) π_i b_i(o_1)
Complexity is O(N²T)
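
A matching sketch of the backward pass, under the same assumptions (no scaling); the final line is the cross-check that it yields the same P(O | λ) as the forward pass.

```python
import numpy as np

def backward(obs, A, B, pi):
    """Backward algorithm: returns beta (T x N) and P(O | lambda) as a cross-check."""
    N, T = A.shape[0], len(obs)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                      # beta_T(j) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])  # sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
    prob = np.sum(pi * B[:, obs[0]] * beta[0])          # P(O | lambda)
    return beta, prob
```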

Forward-Backward
Optimality criterion: choose the states q_t that are individually most likely at each time t.
The probability of being in state i at time t:
γ_t(i) = p(q_t = i | O, λ) = α_t(i) β_t(i) / Σ_{i=1}^{N} α_t(i) β_t(i)
(the numerator is p(O and q_t = i | λ); the denominator is p(O | λ))
α_t(i) accounts for the partial observation sequence o_1, o_2, ..., o_t
β_t(i) accounts for the remainder o_{t+1}, o_{t+2}, ..., o_T
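
Given the forward and backward sketches above, the per-time-step posterior γ is one line of normalization. This reuses the forward and backward functions defined earlier; the function name is mine.

```python
def state_posteriors(obs, A, B, pi):
    """gamma_t(i) = alpha_t(i) beta_t(i) / P(O | lambda): individually most likely states."""
    alpha, prob = forward(obs, A, B, pi)
    beta, _ = backward(obs, A, B, pi)
    gamma = alpha * beta / prob         # T x N, each row sums to 1
    return gamma, gamma.argmax(axis=1)  # posteriors and the per-time best state
```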

Problem 2: Decoding
Choose the state sequence that maximises the probability of the observation sequence.
Viterbi algorithm - an inductive algorithm that keeps the best state sequence at each instant.
[Trellis diagram: states S_1, S_2, S_3 at each time step, observations O_1, O_2, O_3, O_4, ..., O_T]

Problem 2: Decoding
Viterbi algorithm: find the state sequence maximizing P(O, Q | λ), i.e. P(q_1, q_2, ..., q_T | O, λ).
Define the auxiliary variable δ:
δ_t(i) = max_{q_1,...,q_{t-1}} P(q_1, q_2, ..., q_t = i, o_1, o_2, ..., o_t | λ)
δ_t(i) is the probability of the most probable path ending in state q_t = i.

Problem 2: Decoding
Recurrent property: δ_{t+1}(j) = max_i (δ_t(i) a_ij) b_j(o_{t+1})
To get the state sequence, we need to keep track of the argument that maximises this, for each t and j. This is done via the array ψ_t(j).
Algorithm:
1. Initialise: δ_1(i) = π_i b_i(o_1), 1 ≤ i ≤ N; ψ_1(i) = 0

Problem 2: Decoding
2. Recursion:
   δ_t(j) = max_{1≤i≤N} (δ_{t-1}(i) a_ij) b_j(o_t)
   ψ_t(j) = argmax_{1≤i≤N} (δ_{t-1}(i) a_ij)
   for 2 ≤ t ≤ T, 1 ≤ j ≤ N
3. Terminate:
   P* = max_{1≤i≤N} δ_T(i)
   q*_T = argmax_{1≤i≤N} δ_T(i)
P* gives the state-optimized probability.
Q* is the optimal state sequence (Q* = {q*_1, q*_2, ..., q*_T}).

Problem 2: Decoding
4. Backtrack state sequence: q*_t = ψ_{t+1}(q*_{t+1}), t = T-1, T-2, ..., 1
[Trellis diagram: states S_1, S_2, S_3, observations O_1, ..., O_T]
O(N²T) time complexity
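
A minimal sketch of the Viterbi recursion and backtrace (again without log-space arithmetic, which longer sequences would need in practice):

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most likely state path: returns (best path, its probability P*)."""
    N, T = A.shape[0], len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                  # delta_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A        # scores[i, j] = delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)            # best predecessor for each j
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()                 # q*_T
    for t in range(T - 2, -1, -1):                # backtrack via psi
        path[t] = psi[t + 1][path[t + 1]]
    return path, delta[-1].max()
```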

Problem 3: Learning
Train the HMM to encode an observation sequence such that the HMM should identify a similar observation sequence in the future.
Find λ = (A, B, π) maximizing P(O | λ).
General algorithm:
1. Initialize: λ_0
2. Compute a new model λ, using λ_0 and the observed sequence O
3. Set λ_0 ← λ
4. Repeat steps 2 and 3 until: log P(O | λ) - log P(O | λ_0) < d

Problem 3: Learning
Step 1 of the Baum-Welch algorithm:
Let ξ_t(i,j) be the probability of being in state i at time t and in state j at time t+1, given λ and the observation sequence O:
ξ_t(i,j) = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / P(O | λ)
         = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / Σ_{i=1}^{N} Σ_{j=1}^{N} α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j)
(the numerator is p(O and take i to j | λ); the denominator is p(O | λ); so ξ_t(i,j) = p(take i to j at time t | O, λ))

Problem 3: Learning
[Figure: operations required for the computation of the joint event that the system is in state S_i at time t and state S_j at time t+1.]

Problem 3: Learning
Let γ_t(i) be the probability of being in state i at time t, given O:
γ_t(i) = Σ_{j=1}^{N} ξ_t(i,j)
Σ_{t=1}^{T-1} γ_t(i) - expected no. of transitions from state i
Σ_{t=1}^{T-1} ξ_t(i,j) - expected no. of transitions i → j

Problem 3: Learning
Step 2 of the Baum-Welch algorithm:
π̂_i = γ_1(i): the expected frequency of state i at time t = 1
â_ij = Σ_t ξ_t(i,j) / Σ_t γ_t(i): the ratio of the expected no. of transitions from state i to j over the expected no. of transitions from state i
b̂_j(k) = Σ_{t: o_t = k} γ_t(j) / Σ_t γ_t(j): the ratio of the expected no. of times in state j observing symbol k over the expected no. of times in state j

Problem 3: Learning
The Baum-Welch algorithm uses the forward and backward algorithms to calculate the auxiliary variables α, β.
The B-W algorithm is a special case of the EM algorithm:
E-step: calculation of ξ and γ
M-step: iterative calculation of π̂, â_ij, b̂_j(k)
Practical issues: can get stuck in local maxima; numerical problems (use logs and scaling).
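
Putting the E-step (ξ, γ) and M-step (π̂, â, b̂) together, here is a compact sketch of one Baum-Welch iteration for a single discrete observation sequence, reusing the forward and backward sketches above. A real implementation would add the log/scaling tricks the slide mentions and iterate until the likelihood improvement falls below a threshold.

```python
import numpy as np

def baum_welch_step(obs, A, B, pi):
    """One EM iteration of Baum-Welch for a single discrete observation sequence."""
    N, M, T = A.shape[0], B.shape[1], len(obs)
    alpha, prob = forward(obs, A, B, pi)
    beta, _ = backward(obs, A, B, pi)
    # E-step: xi[t, i, j] = p(q_t=i, q_{t+1}=j | O, lambda), gamma[t, i] = p(q_t=i | O, lambda)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
        xi[t] /= xi[t].sum()
    gamma = alpha * beta / prob
    # M-step: re-estimate pi, A, B from expected counts
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros((N, M))
    for k in range(M):
        B_new[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return A_new, B_new, pi_new
```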

Now HMMs and Vision: Gesture Recognition

"Gesure recogniion"-like aciviies

Some thoughts about gesture: There is a conference on Face and Gesture Recognition, so obviously gesture recognition is an important problem. Prototype scenario: the subject does several examples of "each gesture"; the system "learns" (or is trained) to have some sort of model for each; at run time, compare the input to the known models and pick one. New-found life for gesture recognition:

Generic Gesture Recognition using HMMs: Nam, Y., & Wohn, K. (1996, July). Recognition of space-time hand-gestures using hidden Markov model. In ACM Symposium on Virtual Reality Software and Technology (pp. 51-58).

Generic gesture recognition using HMMs (1): Data glove

Generic gesture recognition using HMMs (2)

Generic gesture recognition using HMMs (3)

Generic gesture recognition using HMMs (4)

Generic gesture recognition using HMMs (5)

Wins and Losses of HMMs in Gesture
Good points about HMMs: a learning paradigm that acquires spatial and temporal models and does some amount of feature selection. Recognition is fast; training is not so fast, but not too bad.
Not-so-good points: if you know something about state definitions, it is difficult to incorporate. Every gesture is a new class, independent of anything else you've learned. -> Particularly bad for parameterized gesture.

Parameterized Gesture: "I caught a fish this big."

Parametric HMMs (PAMI, 1999)
Basic ideas: make the output probabilities of the state a function of the parameter of interest, so b_j(x) becomes b_j(x, θ). Maintain the same temporal properties (the transition probabilities unchanged). Train with known parameter values to solve for the dependence of b_j on θ. During testing, use EM to find the θ that gives the highest probability. That probability is the confidence in recognition; the best θ is the parameter estimate.
Issues: How to represent the dependence on θ? How to train given θ? How to test for θ? What are the limitations on the dependence on θ?

Linear PHMM - Representation
Represent the dependence on θ as linear movement of the mean of the Gaussians of the states. We need to learn W_j and μ_j for each state j. (ICCV '98)
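
As a rough sketch of the representation idea only (not the PHMM training or testing procedures): under the linear assumption each state's Gaussian mean is shifted linearly by the gesture parameter θ, i.e. b_j(x, θ) = N(x; μ_j + W_j θ, Σ_j). The function name, argument shapes, and covariance Σ_j here are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def phmm_output_prob(x, theta, W_j, mu_j, Sigma_j):
    """Linear PHMM output density for state j: b_j(x, theta) = N(x; mu_j + W_j @ theta, Sigma_j)."""
    mean = mu_j + W_j @ theta   # the parameter theta linearly moves the state's mean
    return multivariate_normal.pdf(x, mean=mean, cov=Sigma_j)
```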

Linear PHMM - Training: Need to derive the EM equations for the linear parameters and proceed as normal:

Linear PHMM - Testing: Derive the EM equations with respect to θ. We are testing by EM! (i.e. iteratively): solve for γ_k given a guess for θ; solve for θ given a guess for γ_k.

How big was the fish?

Pointing: Pointing is the prototypical example of a parameterized gesture. Assuming two DOF, we can parameterize either by (x, y) or by (θ, φ). Under the linear assumption we must choose carefully. A generalized non-linear map would allow greater freedom. (ICCV '99)

Linear pointing results: Test for both recognition and recovery. If we prune based on legal θ (MAP via a uniform density):

Noise sensitivity: Compare an ad hoc procedure with PHMM parameter recovery (ignoring their recognition problem!).

HMMs and vision: HMMs capture sequencing nicely in a probabilistic manner. Moderate time to train, fast to test. More when we do activity recognition.