Overview
Basic concepts of Bayesian learning
Most probable model given data
    Coin tosses
    Linear regression
    Logistic regression
Bayesian predictions
    Coin tosses
    Linear regression
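Recap: regression problems
Input to the learning problem: training data $L = \{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n)\}$.
Instances are given by a feature vector $\mathbf{x}_i = (x_{i1}, \dots, x_{im})^T$ and a label $y_i$.
Training data in matrix form:
$$X = \begin{pmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$
Output: model $f_{\boldsymbol{\theta}} : X \to Y$.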
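Recap: linear regression
Linear regression: the prediction is a weighted sum of the features.
The model is defined by a parameter vector $\boldsymbol{\theta}$ (weight vector). The constant term $\theta_0$ is integrated into the weight vector by adding a constant attribute $x_0 = 1$:
$$f_{\boldsymbol{\theta}}(\mathbf{x}) = \theta_0 + \sum_{j=1}^{m} \theta_j x_j = \sum_{j=0}^{m} \theta_j x_j = \mathbf{x}^T \boldsymbol{\theta}$$
Here $\mathbf{x} = (1, x_1, \dots, x_m)^T$ (constant attribute followed by the features) and $\boldsymbol{\theta} = (\theta_0, \theta_1, \dots, \theta_m)^T$ (constant term followed by the weights).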
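Recap: ridge regression
Training data $L = \{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n)\}$.
Approach: minimize a regularized loss function
$$\boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{n} \ell(f_{\boldsymbol{\theta}}(\mathbf{x}_i), y_i) + \lambda\, \Omega(\boldsymbol{\theta})$$
Quadratic loss function: $\ell(f_{\boldsymbol{\theta}}(\mathbf{x}_i), y_i) = (f_{\boldsymbol{\theta}}(\mathbf{x}_i) - y_i)^2$.
L2 regularizer: $\Omega(\boldsymbol{\theta}) = \|\boldsymbol{\theta}\|^2$.
Solution: $\boldsymbol{\theta}^* = (X^T X + \lambda I)^{-1} X^T \mathbf{y}$.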
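The closed-form ridge solution can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original slides: the function name `ridge_fit` and the toy data are hypothetical, and the data matrix is assumed to already contain the constant attribute as its first column.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Hypothetical toy data: constant attribute plus one feature, exactly y = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

theta_unreg = ridge_fit(X, y, lam=0.0)   # ordinary least squares
theta_reg = ridge_fit(X, y, lam=10.0)    # regularized: weights are shrunk toward 0
print(theta_unreg, theta_reg)
```

With `lam=0.0` the function reduces to ordinary least squares; increasing `lam` shrinks the norm of the weight vector.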
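Aside: univariate normal distribution
Distribution over $x \in \mathbb{R}$.
Given by a density function with parameters $\mu$ (mean) and $\sigma^2$ (variance):
$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
[Figure: density of the normal distribution]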
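Aside: multivariate normal distribution
Distribution over vectors $\mathbf{x} \in \mathbb{R}^D$.
Given by a density function with parameters mean vector $\boldsymbol{\mu} \in \mathbb{R}^D$ and covariance matrix $\Sigma \in \mathbb{R}^{D \times D}$:
$$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{Z} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$
Normalizer: $Z = (2\pi)^{D/2} |\Sigma|^{1/2}$.
[Figure: example for $D = 2$, showing the density and samples from the distribution]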
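As a rough illustration of the density and of drawing samples, the following sketch evaluates the formula for a hypothetical two-dimensional example. The mean, covariance, seed, and the helper name `mvn_density` are assumptions chosen for the example, not values from the slides.

```python
import numpy as np

def mvn_density(x, mu, Sigma):
    """Density of N(x | mu, Sigma) with normalizer Z = (2*pi)^(D/2) * |Sigma|^(1/2)."""
    D = mu.shape[0]
    diff = x - mu
    Z = (2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(Sigma))
    # solve(Sigma, diff) computes Sigma^{-1} (x - mu) without an explicit inverse.
    return np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / Z

# Hypothetical D = 2 example.
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

peak = mvn_density(mu, mu, Sigma)                       # density at the mean
off_peak = mvn_density(np.array([1.0, 1.0]), mu, Sigma)

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma, size=1000)  # one sample per row
print(peak, off_peak, samples.mean(axis=0))
```

The density is largest at the mean, which is the property the MAP derivation later relies on.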
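Probabilistic linear regression
Linear regression as a probabilistic model:
$$p(y \mid \mathbf{x}, \boldsymbol{\theta}) = \mathcal{N}(y \mid \mathbf{x}^T \boldsymbol{\theta}, \sigma^2)$$
[Figure: regression line $f_{\boldsymbol{\theta}}(\mathbf{x})$ plotted over $x$, with the density $p(y \mid \mathbf{x}, \boldsymbol{\theta}) = \mathcal{N}(y \mid \mathbf{x}^T \boldsymbol{\theta}, \sigma^2)$ centered on the line]
The label $y_i$ is generated by a linear model $f_{\boldsymbol{\theta}^*}(\mathbf{x}_i) = \mathbf{x}_i^T \boldsymbol{\theta}^*$ plus normally distributed noise:
$$y_i = \mathbf{x}_i^T \boldsymbol{\theta}^* + \epsilon_i \quad \text{with} \quad \epsilon_i \sim \mathcal{N}(0, \sigma^2)$$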
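Most probable model given the data
Goal: find the most probable model given the data.
$$\boldsymbol{\theta}^* = \arg\max_{\boldsymbol{\theta}} p(\boldsymbol{\theta} \mid L)$$
Approach: derive the a-posteriori distribution
$$p(\boldsymbol{\theta} \mid L) = \frac{p(L \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})}{p(L)}$$
Likelihood $p(L \mid \boldsymbol{\theta})$: probability of the data, given the model.
Prior $p(\boldsymbol{\theta})$: prior distribution over the parameters.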
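Bayesian linear regression: likelihood
Likelihood of the data:
$$p(L \mid \boldsymbol{\theta}) = p(y_1, \dots, y_n \mid \mathbf{x}_1, \dots, \mathbf{x}_n, \boldsymbol{\theta}) = \prod_{i=1}^{n} p(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}) = \prod_{i=1}^{n} \mathcal{N}(y_i \mid \mathbf{x}_i^T \boldsymbol{\theta}, \sigma^2) = \mathcal{N}(\mathbf{y} \mid X \boldsymbol{\theta}, \sigma^2 I)$$
The first step uses that the instances $\mathbf{x}_i$ are independent of $\boldsymbol{\theta}$; the product uses that the instances are independent of each other. The final expression is a multivariate normal distribution with covariance matrix $\sigma^2 I$.
Notation: $f_{\boldsymbol{\theta}}(\mathbf{x}_i) = \mathbf{x}_i^T \boldsymbol{\theta}$, $X = (\mathbf{x}_1, \dots, \mathbf{x}_n)^T$, $\mathbf{y} = (y_1, \dots, y_n)^T$, and $X \boldsymbol{\theta} = (\mathbf{x}_1^T \boldsymbol{\theta}, \dots, \mathbf{x}_n^T \boldsymbol{\theta})^T$ is the vector of predictions.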
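The likelihood under the noise model can be sketched as follows. The generating weights, noise variance, seed, and the function name `log_likelihood` are hypothetical choices for illustration; the code works with the log-likelihood to avoid numerical underflow of the product.

```python
import numpy as np

def log_likelihood(theta, X, y, sigma2):
    """log p(L | theta) = sum_i log N(y_i | x_i^T theta, sigma2)."""
    residuals = y - X @ theta
    n = y.shape[0]
    return -0.5 * n * np.log(2 * np.pi * sigma2) - residuals @ residuals / (2 * sigma2)

rng = np.random.default_rng(0)
theta_true = np.array([1.0, 2.0])   # hypothetical generating weights
sigma2 = 0.25                       # hypothetical noise variance

# Labels are generated as y_i = x_i^T theta_true + eps_i with eps_i ~ N(0, sigma2).
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, size=50)])
y = X @ theta_true + rng.normal(0.0, np.sqrt(sigma2), size=50)

ll_true = log_likelihood(theta_true, X, y, sigma2)
ll_wrong = log_likelihood(np.array([-1.0, 0.0]), X, y, sigma2)
print(ll_true, ll_wrong)
```

The generating weights should receive a much higher log-likelihood than a parameter vector far from them.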
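Bayesian linear regression: prior
Prior distribution over weight vectors.
Appropriate prior distribution: normal distribution.
$$p(\boldsymbol{\theta}) = \mathcal{N}(\boldsymbol{\theta} \mid \mathbf{0}, \sigma_p^2 I) = \frac{1}{(2\pi)^{m/2} \sigma_p^{m}} \exp\left(-\frac{1}{2\sigma_p^2} \|\boldsymbol{\theta}\|^2\right)$$
$\sigma_p^2$ controls the strength of the prior.
The normal distribution is conjugate to itself: a normally distributed prior and a normal likelihood result in a normally distributed posterior.
[Figure: prior density $p(\boldsymbol{\theta})$ over $(\theta_0, \theta_1)$]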
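Bayesian linear regression: posterior
Posterior distribution over models given the data:
$$p(\boldsymbol{\theta} \mid L) = \frac{1}{Z}\, p(L \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta}) \qquad \text{(Bayes' rule)}$$
$$= \frac{1}{Z}\, \mathcal{N}(\mathbf{y} \mid X \boldsymbol{\theta}, \sigma^2 I)\, \mathcal{N}(\boldsymbol{\theta} \mid \mathbf{0}, \sigma_p^2 I)$$
$$= \mathcal{N}(\boldsymbol{\theta} \mid \bar{\boldsymbol{\theta}}, A^{-1}) \qquad \text{(theorem describing properties of normal distributions)}$$
with
$$\bar{\boldsymbol{\theta}} = \left(X^T X + \frac{\sigma^2}{\sigma_p^2} I\right)^{-1} X^T \mathbf{y}, \qquad A = \sigma^{-2} X^T X + \sigma_p^{-2} I$$
where $X$ is the data matrix, $\mathbf{y}$ the label vector, $\sigma^2$ the noise parameter, and $\sigma_p^2$ the variance of the prior.
The posterior distribution over parameter vectors is again normally distributed, with new mean $\bar{\boldsymbol{\theta}}$ and covariance matrix $A^{-1}$.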
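A minimal sketch of the posterior computation, assuming the precision matrix has the standard form $A = \sigma^{-2} X^T X + \sigma_p^{-2} I$ for a Gaussian likelihood with a zero-mean isotropic Gaussian prior. The function name `posterior`, the toy data, and the variance values are hypothetical.

```python
import numpy as np

def posterior(X, y, sigma2, sigma2_p):
    """Mean and covariance of the posterior N(theta | theta_bar, A^{-1})."""
    d = X.shape[1]
    A = X.T @ X / sigma2 + np.eye(d) / sigma2_p   # posterior precision matrix
    cov = np.linalg.inv(A)                        # posterior covariance A^{-1}
    theta_bar = cov @ (X.T @ y) / sigma2          # posterior mean
    return theta_bar, cov

# Hypothetical toy data: constant attribute plus two features.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0.0, 0.5, size=30)
sigma2, sigma2_p = 0.25, 4.0

theta_bar, cov = posterior(X, y, sigma2, sigma2_p)
# The same mean written in the ridge-like form (X^T X + sigma2/sigma2_p I)^{-1} X^T y.
theta_alt = np.linalg.solve(X.T @ X + (sigma2 / sigma2_p) * np.eye(3), X.T @ y)
print(theta_bar, theta_alt)
```

Both expressions for the mean agree because $\sigma^{-2} A^{-1} = (X^T X + \frac{\sigma^2}{\sigma_p^2} I)^{-1}$.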
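Summary: derivation of posterior
Approach: Bayes' rule
$$p(\boldsymbol{\theta} \mid L) = \frac{p(L \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})}{p(L)}$$
Derive the likelihood, choose an appropriate prior distribution (normal distribution).
Posterior $p(\boldsymbol{\theta} \mid L)$: how probable is the linear regression model $\boldsymbol{\theta}$ after having seen the data $L$?
Computation of the posterior is relatively simple:
    Prior $\mathcal{N}(\boldsymbol{\theta} \mid \mathbf{0}, \sigma_p^2 I)$.
    Observation of data $L$.
    Resulting posterior $\mathcal{N}(\boldsymbol{\theta} \mid \bar{\boldsymbol{\theta}}, A^{-1})$.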
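Bayesian linear regression: MAP model
The posterior over parameter vectors is again normally distributed, with new mean $\bar{\boldsymbol{\theta}}$ and covariance matrix $A^{-1}$.
Most probable model given the data:
$$\boldsymbol{\theta}^* = \arg\max_{\boldsymbol{\theta}} p(\boldsymbol{\theta} \mid L) = \arg\max_{\boldsymbol{\theta}} \mathcal{N}(\boldsymbol{\theta} \mid \bar{\boldsymbol{\theta}}, A^{-1}) = \bar{\boldsymbol{\theta}}$$
with
$$\bar{\boldsymbol{\theta}} = \left(X^T X + \frac{\sigma^2}{\sigma_p^2} I\right)^{-1} X^T \mathbf{y} \qquad \text{and} \qquad A = \sigma^{-2} X^T X + \sigma_p^{-2} I$$
The last step holds because a normal density attains its maximum at its mean.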
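Example: MAP solution for regression
Training data:
$$\mathbf{x}_1 = \begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix},\ \mathbf{x}_2 = \begin{pmatrix} 4 \\ 3 \\ 2 \end{pmatrix},\ \mathbf{x}_3 = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}, \qquad y_1 = 2,\ y_2 = 3,\ y_3 = 4$$
Matrix notation (adding the constant attribute):
$$X = \begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 4 & 3 & 2 \\ 1 & 0 & 1 & 2 \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix}$$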
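Example: MAP solution for regression
Choose
    Variance of prior: $\sigma_p^2 = 1$
    Noise parameter: $\sigma = 0.5$ (so $\sigma^2 = 0.25$)
Compute:
$$\bar{\boldsymbol{\theta}} = \left(X^T X + \frac{\sigma^2}{\sigma_p^2} I\right)^{-1} X^T \mathbf{y}$$
$$= \left( \begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 4 & 3 & 2 \\ 1 & 0 & 1 & 2 \end{pmatrix}^T \begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 4 & 3 & 2 \\ 1 & 0 & 1 & 2 \end{pmatrix} + \frac{0.5^2}{1} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \right)^{-1} \begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 4 & 3 & 2 \\ 1 & 0 & 1 & 2 \end{pmatrix}^T \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 0.7975 \\ -0.5598 \\ 0.7543 \\ 1.1217 \end{pmatrix}$$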
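Example: MAP solution for regression
Predictions of the model on the training data:
$$\hat{\mathbf{y}} = X \bar{\boldsymbol{\theta}} = \begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 4 & 3 & 2 \\ 1 & 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} 0.7975 \\ -0.5598 \\ 0.7543 \\ 1.1217 \end{pmatrix} = \begin{pmatrix} 1.9408 \\ 3.0646 \\ 3.7952 \end{pmatrix}$$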
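The worked example can be checked numerically. The sketch below assumes the data matrix and labels of the example with the constant attribute in the first column, and a regularization coefficient $\sigma^2 / \sigma_p^2 = 0.5^2 / 1 = 0.25$; the variable names are chosen for illustration.

```python
import numpy as np

# Data matrix with constant attribute, and label vector, from the example.
X = np.array([[1.0, 2.0, 3.0, 0.0],
              [1.0, 4.0, 3.0, 2.0],
              [1.0, 0.0, 1.0, 2.0]])
y = np.array([2.0, 3.0, 4.0])

sigma2_p = 1.0       # variance of the prior
sigma2 = 0.5 ** 2    # noise variance (assumes the noise parameter 0.5 is sigma)

# MAP solution: (X^T X + sigma2/sigma2_p * I)^{-1} X^T y
theta_bar = np.linalg.solve(X.T @ X + (sigma2 / sigma2_p) * np.eye(4), X.T @ y)
y_hat = X @ theta_bar   # predictions on the training data

print(np.round(theta_bar, 4))
print(np.round(y_hat, 4))
```

Up to rounding, the printed values should agree with the weight vector and the predictions given in the example.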
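Connection to ridge regression
MAP parameter of linear regression:
$$\bar{\boldsymbol{\theta}} = \left(X^T X + \frac{\sigma^2}{\sigma_p^2} I\right)^{-1} X^T \mathbf{y}$$
Recall ridge regression:
$$\boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{n} (f_{\boldsymbol{\theta}}(\mathbf{x}_i) - y_i)^2 + \lambda \|\boldsymbol{\theta}\|^2 = (X^T X + \lambda I)^{-1} X^T \mathbf{y}$$
The MAP solution of Bayesian regression is identical to the solution of ridge regression for $\lambda = \sigma^2 / \sigma_p^2$.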
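Connection to ridge regression
Connection between loss function and likelihood, and between regularizer and prior.
MAP model:
$$\boldsymbol{\theta}^* = \arg\max_{\boldsymbol{\theta}} p(\boldsymbol{\theta} \mid L) = \arg\max_{\boldsymbol{\theta}} p(L \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})$$
$$= \arg\max_{\boldsymbol{\theta}} \log p(L \mid \boldsymbol{\theta}) + \log p(\boldsymbol{\theta})$$
$$= \arg\min_{\boldsymbol{\theta}} \underbrace{-\log p(L \mid \boldsymbol{\theta})}_{\text{negative log-likelihood}} \; \underbrace{-\, \log p(\boldsymbol{\theta})}_{\text{negative log-prior}}$$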
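The argmin formulation can be checked numerically. With the Gaussian likelihood and prior used in this lecture, both terms are quadratic in the parameters up to additive constants, so the closed-form solution should beat every nearby parameter vector. The data, seed, variances, and the name `neg_log_posterior` are hypothetical.

```python
import numpy as np

def neg_log_posterior(theta, X, y, sigma2, sigma2_p):
    """-log p(L | theta) - log p(theta), dropping additive constants."""
    residuals = y - X @ theta
    neg_log_lik = residuals @ residuals / (2 * sigma2)
    neg_log_prior = theta @ theta / (2 * sigma2_p)
    return neg_log_lik + neg_log_prior

# Hypothetical toy data: constant attribute plus two features.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(0.0, 0.3, size=20)
sigma2, sigma2_p = 0.09, 1.0

# Closed-form MAP solution.
theta_map = np.linalg.solve(X.T @ X + (sigma2 / sigma2_p) * np.eye(3), X.T @ y)
best = neg_log_posterior(theta_map, X, y, sigma2, sigma2_p)

# Random perturbations of the closed-form solution should never do better.
perturbed = [
    neg_log_posterior(theta_map + 0.1 * rng.normal(size=3), X, y, sigma2, sigma2_p)
    for _ in range(100)
]
print(best, min(perturbed))
```

This is only a spot check on random perturbations, not a proof; the objective is strictly convex, so the minimizer is unique.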
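Connection to ridge regression
The negative log-likelihood corresponds to the squared loss:
$$-\log p(L \mid \boldsymbol{\theta}) = -\log \prod_{i=1}^{n} p(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}) = -\sum_{i=1}^{n} \log \mathcal{N}(y_i \mid \mathbf{x}_i^T \boldsymbol{\theta}, \sigma^2)$$
$$= -\sum_{i=1}^{n} \log \left[ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2} (y_i - \mathbf{x}_i^T \boldsymbol{\theta})^2\right) \right] = \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mathbf{x}_i^T \boldsymbol{\theta})^2 + \text{const}$$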
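Connection to ridge regression
The negative log-prior corresponds to the regularizer:
$$-\log p(\boldsymbol{\theta}) = -\log \mathcal{N}(\boldsymbol{\theta} \mid \mathbf{0}, \sigma_p^2 I) = -\log \left[ \frac{1}{(2\pi)^{m/2} \sigma_p^{m}} \exp\left(-\frac{1}{2\sigma_p^2} \|\boldsymbol{\theta}\|^2\right) \right] = \frac{1}{2\sigma_p^2} \|\boldsymbol{\theta}\|^2 + \text{const}$$
The MAP solution corresponds to the minimization of a regularized loss function.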
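Putting the two terms together makes the regularization coefficient explicit. The following step is a short worked derivation under the Gaussian likelihood and prior used here; multiplying the objective by the positive constant $2\sigma^2$ does not change the minimizer.

```latex
\begin{align*}
\boldsymbol{\theta}^*
  &= \arg\min_{\boldsymbol{\theta}} \;
     \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mathbf{x}_i^T \boldsymbol{\theta})^2
     + \frac{1}{2\sigma_p^2} \|\boldsymbol{\theta}\|^2 \\
  &= \arg\min_{\boldsymbol{\theta}} \;
     \sum_{i=1}^{n} (y_i - \mathbf{x}_i^T \boldsymbol{\theta})^2
     + \frac{\sigma^2}{\sigma_p^2} \|\boldsymbol{\theta}\|^2
\end{align*}
```

This is the ridge regression objective with $\lambda = \sigma^2 / \sigma_p^2$: a small prior variance or a large noise variance both lead to stronger regularization.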
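Visualization: sequential update of posterior
Computation of the posterior by sequential updating: multiply the likelihoods of the individual instances $y_i$ onto the prior, one at a time.
$$p(\boldsymbol{\theta} \mid L) \propto p(\boldsymbol{\theta})\, p(\mathbf{y} \mid X, \boldsymbol{\theta}) = p(\boldsymbol{\theta}) \prod_{i=1}^{n} p(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}) \qquad \text{(instances independent)}$$
Let $p_0(\boldsymbol{\theta}) = p(\boldsymbol{\theta})$, and let $p_k(\boldsymbol{\theta})$ be the posterior if only the first $k$ instances of $L$ are used:
$$p(\boldsymbol{\theta} \mid L) \propto p(\boldsymbol{\theta})\, p(y_1 \mid \mathbf{x}_1, \boldsymbol{\theta})\, p(y_2 \mid \mathbf{x}_2, \boldsymbol{\theta})\, p(y_3 \mid \mathbf{x}_3, \boldsymbol{\theta}) \cdots p(y_n \mid \mathbf{x}_n, \boldsymbol{\theta})$$
The first factor is $p_0(\boldsymbol{\theta})$, the first two factors give $p_1(\boldsymbol{\theta})$, the first three give $p_2(\boldsymbol{\theta})$, and so on up to $p_n(\boldsymbol{\theta})$, so that
$$p_k(\boldsymbol{\theta}) \propto p_{k-1}(\boldsymbol{\theta})\, p(y_k \mid \mathbf{x}_k, \boldsymbol{\theta})$$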
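For the Gaussian case, the sequential update can be sketched by tracking the precision matrix and the precision-weighted mean, since multiplying by one Gaussian likelihood factor simply adds to both. The function name `sequential_posterior`, the toy data, and the variances are assumptions for illustration.

```python
import numpy as np

def sequential_posterior(X, y, sigma2, sigma2_p):
    """Update the Gaussian posterior one instance at a time."""
    d = X.shape[1]
    A = np.eye(d) / sigma2_p   # precision of the prior p_0
    b = np.zeros(d)            # precision-weighted mean A @ mean (prior mean is 0)
    for x_k, y_k in zip(X, y):
        # Multiplying by N(y_k | x_k^T theta, sigma2) adds to both quantities.
        A = A + np.outer(x_k, x_k) / sigma2
        b = b + x_k * y_k / sigma2
    return np.linalg.solve(A, b), np.linalg.inv(A)

# Hypothetical one-dimensional regression data with constant attribute.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(10), rng.uniform(-1, 1, size=10)])
y = X @ np.array([0.5, -1.0]) + rng.normal(0.0, 0.2, size=10)
sigma2, sigma2_p = 0.04, 1.0

mean_seq, cov_seq = sequential_posterior(X, y, sigma2, sigma2_p)
# Batch posterior mean computed from all instances at once, for comparison.
mean_batch = np.linalg.solve(X.T @ X + (sigma2 / sigma2_p) * np.eye(2), X.T @ y)
print(mean_seq, mean_batch)
```

The sequential and batch results coincide, and the order in which instances are processed does not matter.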
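Example: sequential update of posterior
$f_{\boldsymbol{\theta}}(x) = \theta_0 + \theta_1 x$ (one-dimensional regression)
Sequential update: $p_0(\boldsymbol{\theta}) = p(\boldsymbol{\theta})$
[Figure: prior $p_0(\boldsymbol{\theta}) = p(\boldsymbol{\theta})$ over $(\theta_0, \theta_1)$; regression lines sampled from $p_0(\boldsymbol{\theta})$]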
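Example: sequential update of posterior
$f_{\boldsymbol{\theta}}(x) = \theta_0 + \theta_1 x$ (one-dimensional regression)
Sequential update: $p_1(\boldsymbol{\theta}) \propto p_0(\boldsymbol{\theta})\, p(y_1 \mid x_1, \boldsymbol{\theta})$
Instance $(x_1, y_1)$ with $y_1 = f_{\boldsymbol{\theta}}(x_1) + \epsilon_1 = \theta_0 + \theta_1 x_1 + \epsilon_1$, so the likelihood $p(y_1 \mid x_1, \boldsymbol{\theta})$ is concentrated around the line $\theta_0 + \theta_1 x_1 = y_1$ in parameter space.
[Figure: likelihood $p(y_1 \mid x_1, \boldsymbol{\theta})$ and posterior $p_1(\boldsymbol{\theta})$ over $(\theta_0, \theta_1)$; instance $(x_1, y_1)$ with regression lines sampled from $p_1(\boldsymbol{\theta})$]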
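Example: sequential update of posterior
$f_{\boldsymbol{\theta}}(x) = \theta_0 + \theta_1 x$ (one-dimensional regression)
Sequential update: $p_2(\boldsymbol{\theta}) \propto p_1(\boldsymbol{\theta})\, p(y_2 \mid x_2, \boldsymbol{\theta})$
[Figure: likelihood $p(y_2 \mid x_2, \boldsymbol{\theta})$ and posterior $p_2(\boldsymbol{\theta})$ over $(\theta_0, \theta_1)$; regression lines sampled from $p_2(\boldsymbol{\theta})$]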
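Example: sequential update of posterior
$f_{\boldsymbol{\theta}}(x) = \theta_0 + \theta_1 x$ (one-dimensional regression)
Sequential update: $p_n(\boldsymbol{\theta}) \propto p_{n-1}(\boldsymbol{\theta})\, p(y_n \mid x_n, \boldsymbol{\theta})$
[Figure: likelihood $p(y_n \mid x_n, \boldsymbol{\theta})$ and posterior $p_n(\boldsymbol{\theta})$ over $(\theta_0, \theta_1)$; regression lines sampled from $p_n(\boldsymbol{\theta})$]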
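The behaviour shown in this sequence of figures can be imitated with a small simulation: as more instances are used, the posterior over $(\theta_0, \theta_1)$ contracts and sampled regression lines agree more closely. The generating line, variances, seed, and the helper `posterior_after` are hypothetical choices, not the values behind the original plots.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = np.array([-0.3, 0.5])   # hypothetical generating line
sigma2, sigma2_p = 0.04, 1.0         # hypothetical noise and prior variances

x = rng.uniform(-1, 1, size=20)
X = np.column_stack([np.ones_like(x), x])   # constant attribute plus feature
y = X @ theta_true + rng.normal(0.0, np.sqrt(sigma2), size=20)

def posterior_after(k):
    """Posterior mean and covariance using only the first k instances."""
    Xk, yk = X[:k], y[:k]
    A = Xk.T @ Xk / sigma2 + np.eye(2) / sigma2_p
    cov = np.linalg.inv(A)
    return cov @ (Xk.T @ yk) / sigma2, cov

spreads = {}
for k in (0, 1, 2, 20):
    mean, cov = posterior_after(k)
    # Each sampled row (theta_0, theta_1) defines one regression line.
    lines = rng.multivariate_normal(mean, cov, size=6)
    spreads[k] = np.trace(cov)   # total posterior variance
    print(k, np.round(mean, 2), round(float(spreads[k]), 4))
```

For `k = 0` the function returns the prior itself; the total variance then decreases with every added instance.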