Clustering using Mixture Models


1 Clustering using Mixture Models
The full posterior of the Gaussian Mixture Model is
$p(\mathbf{X}, \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\pi}) = p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma})\, p(\mathbf{Z} \mid \boldsymbol{\pi})\, p(\boldsymbol{\pi})\, p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
with the factors: data likelihood (Gaussian), correspondence prob. (Multinomial), mixture prior (Dirichlet), parameter prior (Gauss-IW).
Given this model, we can create new samples:
1. Sample $\boldsymbol{\pi}$ and $\boldsymbol{\theta}_k$ from the priors
2. Sample correspondences $z_i$
3. Sample data points $x_i$
[Figure: graphical model with plates over $i = 1, \dots, N$ ($z_i \to x_i$) and $k = 1, \dots, K$ ($\boldsymbol{\theta}_k$)]

2 Repetition: MAP for Regression
In MLE, we searched for parameters $\mathbf{w}$, $\sigma$ that maximize the data likelihood. Now, we assume a Gaussian prior. Using this, we can compute the posterior (Bayes):
$p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) \propto p(\mathbf{t} \mid \mathbf{w}, \mathbf{x})\, p(\mathbf{w})$
Posterior $\propto$ Likelihood $\times$ Prior
This is Maximum A-Posteriori estimation (MAP).
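As a quick numerical illustration, a minimal Python sketch of this MAP estimate for linear regression: it assumes a zero-mean isotropic Gaussian prior on w and Gaussian observation noise, and the toy data, design matrix Phi, and precision values alpha and beta are made up for the example.

```python
import numpy as np

# Toy 1-D regression data: t = 2*x - 1 plus Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=30)
t = 2 * x - 1 + 0.1 * rng.standard_normal(30)

Phi = np.column_stack([np.ones_like(x), x])  # design matrix (bias + linear term)
alpha, beta = 1.0, 100.0                     # prior precision, noise precision (assumed values)

# MAP estimate: maximizing p(w | x, t) ∝ p(t | w, x) p(w) gives a ridge-type solution
A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
w_map = beta * np.linalg.solve(A, Phi.T @ t)

w_mle = np.linalg.lstsq(Phi, t, rcond=None)[0]  # plain MLE for comparison
print("MAP:", w_map, " MLE:", w_mle)
```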

3 Generalization: The Bayesian Approach
This idea can be generalized:
- Given a data-dependent likelihood term
- Find an appropriate prior distribution
- Multiply both and obtain the (unnormalized) posterior from Bayes rule
Main benefit: less overfitting.
However: How should we define the prior? Often used principle: Conjugacy.

4 Conjugate Priors
A conjugate prior distribution allows representing the posterior in the same functional (closed) form as the prior, e.g.:
$\mathcal{N}(\mathbf{w}; \mathbf{0}, \sigma^2 \mathbf{I}_M) \cdot \mathcal{N}(\mathbf{t}; \mathbf{\Phi}\mathbf{w}, \beta^{-1}\mathbf{I}) = \mathcal{N}(\mathbf{w}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$
(Gaussian prior) (Gaussian likelihood) (Gaussian posterior)
Common pairs of likelihood and conjugate priors are:
Likelihood                    Conjugate Prior
Normal with known variance    Normal
Binomial                      Beta
Multinomial                   Dirichlet
Multivariate Normal           Normal-inverse Wishart
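A small numerical illustration of the first table row (Normal likelihood with known variance, Normal prior on the mean); the prior parameters and noise level are example values.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5                        # known observation std
data = rng.normal(3.0, sigma, 20)  # samples from N(3, sigma^2)

mu0, tau0 = 0.0, 2.0               # Gaussian prior on the mean: N(mu0, tau0^2)

# Conjugacy: Gaussian prior x Gaussian likelihood -> Gaussian posterior (closed form)
n = len(data)
post_var = 1.0 / (1.0 / tau0**2 + n / sigma**2)
post_mean = post_var * (mu0 / tau0**2 + data.sum() / sigma**2)
print(f"posterior over the mean: N({post_mean:.3f}, {post_var:.4f})")  # (mean, variance)
```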

5 Multinomial
Given K clusters and probabilities $\mu_1, \dots, \mu_K$ of these clusters, where $\sum_{k=1}^{K} \mu_k = 1$.
The probability that out of N samples $m_k$ are in cluster k is:
$p(m_1, \dots, m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2\, \cdots\, m_K} \prod_{k=1}^{K} \mu_k^{m_k}$
This is called the multinomial distribution.
In our case:
$p(\mathbf{Z} \mid \boldsymbol{\mu}) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mu_k^{z_{nk}} = \prod_{k=1}^{K} \mu_k^{m_k}$

6 The Dirichlet Distribution
The Dirichlet distribution is defined as:
$\mathrm{Dir}(\boldsymbol{\mu} \mid \boldsymbol{\alpha}) = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_K)} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1}, \qquad \alpha_0 = \sum_{k=1}^{K} \alpha_k, \quad 0 \le \mu_k \le 1, \quad \sum_{k=1}^{K} \mu_k = 1$
It is the conjugate prior for the multinomial distribution. There, the parameter $\alpha_k$ can be interpreted as the effective number of observations for every state.
[Figure: the simplex for K = 3]

7 Some Examples
[Figure: Dirichlet densities for $\boldsymbol{\alpha} = (2, 2, 2)$, $\boldsymbol{\alpha} = (20, 2, 2)$, and $\boldsymbol{\alpha} = (0.1, 0.1, 0.1)$]
$\alpha_0$ controls the strength of the distribution ("peakedness"), the $\alpha_k$ control the location of the peak.

8 Clustering using Mixture Models
The full posterior of the Gaussian Mixture Model is
$p(\mathbf{X}, \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\pi}) = p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma})\, p(\mathbf{Z} \mid \boldsymbol{\pi})\, p(\boldsymbol{\pi})\, p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
with the factors: data likelihood (Gaussian), correspondence prob. (Multinomial), mixture prior (Dirichlet), parameter prior (Gauss-IW).
Given this model, we can create new samples:
1. Sample $\boldsymbol{\pi}$ and $\boldsymbol{\theta}_k$ from the priors
2. Sample correspondences $z_i$
3. Sample data points $x_i$

9 Clustering using Mixture Models
The full posterior of the Gaussian Mixture Model is
$p(\mathbf{X}, \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\pi}) = p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma})\, p(\mathbf{Z} \mid \boldsymbol{\pi})\, p(\boldsymbol{\pi})\, p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
with the factors: data likelihood (Gaussian), correspondence prob. (Multinomial), mixture prior (Dirichlet), parameter prior (Gauss-IW).
[Figure: graphical model with the distributions annotated:]
$\boldsymbol{\pi} \sim \mathrm{Dir}(\alpha/K, \dots, \alpha/K)$
$z_i \sim \mathrm{Mult}(\boldsymbol{\pi})$
$\boldsymbol{\theta}_k \sim \mathrm{NIW}(\cdot)$
$x_i \sim \mathcal{N}(\boldsymbol{\theta}_{z_i})$

10 Clustering using Mixture Models
The full posterior of the Gaussian Mixture Model is
$p(\mathbf{X}, \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\pi}) = p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\mu}, \boldsymbol{\Sigma})\, p(\mathbf{Z} \mid \boldsymbol{\pi})\, p(\boldsymbol{\pi})\, p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
with the factors: data likelihood (Gaussian), correspondence prob. (Multinomial), mixture prior (Dirichlet), parameter prior (Gauss-IW).
An equivalent formulation of this model is this:
1. Sample $\boldsymbol{\pi}$, $\boldsymbol{\theta}_k$ from the priors
2. Sample parameters $\bar{\boldsymbol{\theta}}_i$ from a discrete distribution $G$
3. Sample data point $x_i$
[Figure: graphical model with one parameter $\bar{\boldsymbol{\theta}}_i$ per observation $x_i$]

11 Clustering using Mixture Models
What is the difference in that model?
- there is one parameter $\bar{\boldsymbol{\theta}}_i$ for each observation $x_i$
- intuitively: we first sample the location of the cluster and then the data that corresponds to it
In general, we use the notation:
$\boldsymbol{\pi} \sim \mathrm{Dir}(\tfrac{\alpha}{K}\mathbf{1})$
$\boldsymbol{\theta}_k \sim H$ (base distribution)
$\bar{\boldsymbol{\theta}}_i \sim G$, where $G(\boldsymbol{\pi}, \boldsymbol{\theta}) = \sum_{k=1}^{K} \pi_k\, \delta(\boldsymbol{\theta}_k, \bar{\boldsymbol{\theta}}_i)$
However: We need to know K!
[Figure: graphical model with plates over $i = 1, \dots, N$ and $k = 1, \dots, K$]

12 The Dirichlet Process
So far, we assumed that K is known. To extend that to infinity, we use a trick:
Definition: A Dirichlet process (DP) is a distribution over probability measures $G$, i.e. $G(\boldsymbol{\theta}) \ge 0$ and $\int G(\boldsymbol{\theta})\, d\boldsymbol{\theta} = 1$. If for any partition $(T_1, \dots, T_K)$ it holds:
$(G(T_1), \dots, G(T_K)) \sim \mathrm{Dir}(\alpha H(T_1), \dots, \alpha H(T_K))$
then G is sampled from a Dirichlet process.
Notation: $G \sim \mathrm{DP}(\alpha, H)$, where $\alpha$ is the concentration parameter and $H$ is the base measure.

13 Intuitive Interpretation
Every sample from a Dirichlet distribution is a vector of K positive values that sum up to 1, i.e. the sample itself is a finite distribution.
Accordingly, a sample from a Dirichlet process is an infinite (but still discrete!) distribution.
[Figure: base distribution (here Gaussian) and a DP draw consisting of infinitely many point masses that sum up to 1]

14 Construction of a Dirichlet Process
The Dirichlet process is only defined implicitly, i.e. we can test whether a given probability measure is sampled from a DP, but we cannot yet construct one.
A DP can be constructed using the stick-breaking analogy:
- imagine a stick of length 1
- we select a random number β between 0 and 1 from a Beta distribution
- we break the stick at π = β · (length of stick)
- we repeat this infinitely often

15 The Stick-Breaking Construction
[Figure: a stick of length 1 broken repeatedly; example weight sequences for α = 2 and α = 5]
Formally, we have
$\beta_k \sim \mathrm{Beta}(1, \alpha), \qquad \pi_k = \beta_k \prod_{l=1}^{k-1} (1 - \beta_l) = \beta_k \Big(1 - \sum_{l=1}^{k-1} \pi_l\Big)$
Now we define
$G(\boldsymbol{\theta}) = \sum_{k=1}^{\infty} \pi_k\, \delta(\boldsymbol{\theta}_k, \boldsymbol{\theta}), \qquad \boldsymbol{\theta}_k \sim H$
then: $G \sim \mathrm{DP}(\alpha, H)$
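A truncated stick-breaking sketch in Python; the finite truncation level T stands in for the infinite sum, and the base measure H is taken to be a standard Gaussian purely for illustration.

```python
import numpy as np

def stick_breaking(alpha, T, rng):
    """Return T weights pi_k and atoms theta_k approximating G ~ DP(alpha, N(0,1))."""
    beta = rng.beta(1.0, alpha, size=T)                        # beta_k ~ Beta(1, alpha)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - beta)[:-1]])
    pi = beta * remaining                                      # pi_k = beta_k * prod_{l<k}(1 - beta_l)
    theta = rng.normal(0.0, 1.0, size=T)                       # theta_k ~ H (here: standard normal)
    return pi, theta

rng = np.random.default_rng(0)
for alpha in (2.0, 5.0):
    pi, _ = stick_breaking(alpha, T=1000, rng=rng)
    print(f"alpha={alpha}: largest weights {np.round(np.sort(pi)[::-1][:5], 3)}, "
          f"total mass {pi.sum():.4f}")
```

Small α concentrates the mass on a few atoms, large α spreads it over many, which matches the behavior of the examples for α = 2 and α = 5 above.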

16 The Chinese Restaurant Process...
Consider a restaurant with infinitely many tables. Every time a new customer comes in, he sits at an occupied table with probability proportional to the number of people sitting at that table, but he may choose to sit at a new table, with decreasing probability as more customers enter the room.

17 The Chinese Restaurant Process
It can be shown that the probability for a new customer is
$p(\bar{\boldsymbol{\theta}}_{N+1} = \boldsymbol{\theta} \mid \bar{\boldsymbol{\theta}}_{1:N}, \alpha, H) = \frac{1}{\alpha + N} \Big( \alpha H(\boldsymbol{\theta}) + \sum_{k=1}^{K} N_k\, \delta(\boldsymbol{\theta}_k, \boldsymbol{\theta}) \Big)$
This means that currently occupied tables are more likely to get new customers ("rich get richer").
The number of occupied tables grows logarithmically with the number of customers.
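A small simulation of this seating process; it also illustrates the logarithmic growth of the number of occupied tables (the value of alpha is arbitrary).

```python
import numpy as np

def crp(n_customers, alpha, rng):
    """Simulate table assignments of the Chinese Restaurant Process."""
    tables = []                                   # number of customers per table
    assignments = []
    for n in range(n_customers):
        probs = np.array(tables + [alpha], dtype=float)
        probs /= alpha + n                        # existing table k: N_k/(alpha+n), new table: alpha/(alpha+n)
        k = rng.choice(len(probs), p=probs)
        if k == len(tables):
            tables.append(1)                      # open a new table
        else:
            tables[k] += 1
        assignments.append(k)
    return assignments, tables

rng = np.random.default_rng(0)
for N in (100, 1000, 10000):
    _, tables = crp(N, alpha=2.0, rng=rng)
    print(f"N={N:>6}: {len(tables)} occupied tables (~ alpha*log(N) = {2.0*np.log(N):.1f})")
```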

18 The DP for Mixture Modeling
Using the stick-breaking construction, we see that we can extend the mixture model clustering to the situation where K goes to infinity.
The algorithm can be implemented using Gibbs sampling.
[Figure: DP mixture clustering results at different Gibbs iterations (e.g. iteration 50)]

19 Questions
What if the clusters cannot be approximated well by Gaussians?
Can we formulate an algorithm that only relies on pairwise similarities?
One example for such an algorithm is Spectral Clustering.

20 Spectral Clustering
Consider an undirected graph that connects all data points. The edge weights $w_{ij}$ are the similarities ("closeness"); they are collected in the weight matrix $W$.
We define the weighted degree $d_i$ of a node as the sum of all outgoing edge weights:
$d_i = \sum_{j=1}^{N} w_{ij}$
and collect the degrees in the diagonal degree matrix $D = \mathrm{diag}(d_1, d_2, \dots, d_N)$.

21 Spectral Clustering
The Graph Laplacian is defined as: $L = D - W$
This matrix has the following properties:
- the $\mathbf{1}$ vector is an eigenvector with eigenvalue 0

22 Spectral Clustering
The Graph Laplacian is defined as: $L = D - W$
This matrix has the following properties:
- the $\mathbf{1}$ vector is an eigenvector with eigenvalue 0
- the matrix is symmetric and positive semi-definite

23 Spectral Clustering
The Graph Laplacian is defined as: $L = D - W$
This matrix has the following properties:
- the $\mathbf{1}$ vector is an eigenvector with eigenvalue 0
- the matrix is symmetric and positive semi-definite
With these properties we can show:
Theorem: The set of eigenvectors of $L$ with eigenvalue 0 is spanned by the indicator vectors $\mathbf{1}_{A_1}, \dots, \mathbf{1}_{A_K}$, where the $A_k$ are the K connected components of the graph.

24 The Algorithm
Input: Similarity matrix W
- Compute L = D - W
- Compute the eigenvectors that correspond to the K smallest eigenvalues
- Stack these vectors as columns in a matrix U
- Treat each row of U as a K-dimensional data point
- Cluster the N rows with K-means clustering
The indices of the rows that correspond to the resulting clusters are those of the original data points.
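A direct sketch of this algorithm; k-means on the rows of U is delegated to scikit-learn here, and the Gaussian similarity function with its bandwidth sigma is an assumption, not part of the slide.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, K, sigma=1.0):
    """Unnormalized spectral clustering on a Gaussian similarity graph."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma**2))       # similarity matrix
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                    # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)         # symmetric matrix -> eigh (ascending eigenvalues)
    U = eigvecs[:, :K]                           # eigenvectors of the K smallest eigenvalues
    return KMeans(n_clusters=K, n_init=10).fit_predict(U)   # cluster the rows of U

# Two concentric rings: hard for plain k-means, easy for spectral clustering
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.repeat([1.0, 4.0], 100)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
X += 0.1 * rng.standard_normal((200, 2))
labels = spectral_clustering(X, K=2, sigma=0.5)
print(np.bincount(labels))                       # two clusters of 100 points each
```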

25 An Example
[Figure: k-means clustering vs. spectral clustering on the same dataset]
Spectral clustering can handle complex problems such as this one.
The complexity of the algorithm is $O(N^3)$, because it has to solve an eigenvector problem. But there are efficient variants of the algorithm.

26 Further Remarks
To account for nodes that are highly connected, we can use a normalized version of the graph Laplacian. Two different methods exist:
$L_{\mathrm{rw}} = D^{-1} L = I - D^{-1} W$
$L_{\mathrm{sym}} = D^{-\frac{1}{2}} L D^{-\frac{1}{2}} = I - D^{-\frac{1}{2}} W D^{-\frac{1}{2}}$
These have eigenspaces similar to those of the original Laplacian L. Clustering results tend to be better than with the unnormalized Laplacian.
The number of clusters K can be found using the eigen-gap heuristic.

27 Eigen-Gap Heuristic
- Compute all eigenvalues of the graph Laplacian
- Sort them in increasing order
- Usually, there is a big jump between two consecutive eigenvalues
- The corresponding number K is a good choice for the estimated number of clusters
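A small self-contained sketch of the heuristic on a toy similarity matrix with three planted blocks (the block sizes and noise level are made up).

```python
import numpy as np

# Similarity matrix with three obvious blocks plus weak noise edges between them
rng = np.random.default_rng(0)
blocks = [40, 30, 30]
W = np.zeros((100, 100))
start = 0
for b in blocks:
    W[start:start + b, start:start + b] = 0.9 + 0.1 * rng.random((b, b))
    start += b
W += 0.01 * rng.random((100, 100))         # weak edges between the blocks
W = (W + W.T) / 2                          # keep it symmetric
np.fill_diagonal(W, 0.0)

L = np.diag(W.sum(axis=1)) - W             # graph Laplacian
eigvals = np.sort(np.linalg.eigvalsh(L))   # increasing order
gaps = np.diff(eigvals[:10])               # look at the first few eigenvalues
K_estimate = int(np.argmax(gaps)) + 1      # biggest jump after the K-th eigenvalue
print("estimated K:", K_estimate)          # -> 3 for this toy matrix
```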

28 Hierarchical Clustering
Often, we want to have nested clusters instead of a flat clustering. Two possible methods:
- bottom-up or agglomerative clustering
- top-down or divisive clustering
Both methods take a dissimilarity matrix as input. Bottom-up merges points into ever larger clusters; top-down splits clusters into sub-clusters.
Both are heuristics: there is no clear objective function, and they always produce a clustering (also for noise).

29 Agglomerative Clustering
- Start with N clusters, each containing exactly one data point
- At each step, merge the two most similar groups
- Repeat until there is a single group
[Figure: resulting dendrogram]

30 Linkage
In agglomerative clustering, it is important to define a distance measure between two clusters. There are three different methods:
- Single linkage: considers the two closest elements from both clusters and uses their distance
- Complete linkage: considers the two farthest elements from both clusters
- Average linkage: uses the average distance between pairs of points from both clusters
Depending on the application, one linkage should be preferred over the other.

31 Single Linkage
The distance is based on
$d_{\mathrm{SL}}(G, H) = \min_{i \in G,\, i' \in H} d_{i,i'}$
The resulting dendrogram is a minimum spanning tree, i.e. it minimizes the sum of the edge weights.
Thus: we can compute the clustering in $O(N^2)$ time.
[Figure: single-link clustering example]

32 Complete Linkage
The distance is based on
$d_{\mathrm{CL}}(G, H) = \max_{i \in G,\, i' \in H} d_{i,i'}$
Complete linkage fulfills the compactness property, i.e. all points in a group should be similar to each other.
Tends to produce clusters with smaller diameter.
[Figure: complete-link clustering example]

33 Average Linkage
The distance is based on
$d_{\mathrm{avg}}(G, H) = \frac{1}{n_G n_H} \sum_{i \in G} \sum_{i' \in H} d_{i,i'}$
It is a good compromise between single and complete linkage.
However: it is sensitive to changes of the measurement scale.
[Figure: average-link clustering example]
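The three linkage criteria map directly onto SciPy's hierarchical clustering routines; a brief sketch on toy data (the two blobs and the cut into two clusters are illustrative).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)),      # two well-separated blobs
               rng.normal(5, 0.5, (20, 2))])
d = pdist(X)                                     # pairwise dissimilarities d_{i,i'}

for method in ("single", "complete", "average"):
    Z = linkage(d, method=method)                # agglomerative merge tree (dendrogram)
    labels = fcluster(Z, t=2, criterion="maxclust")
    print(method, np.bincount(labels)[1:])       # cluster sizes after cutting into 2 groups
```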

34 Divisive Clustering
- Start with all data in a single cluster
- Recursively divide each cluster into two child clusters
Problem: the optimal split is hard to find.
- Idea: take the cluster with the largest diameter and split it with K-means using K = 2
- Or: use the minimum spanning tree and cut the links with the largest dissimilarity
In general two advantages:
- Can be faster
- More globally informed (not myopic like bottom-up)

35 Choosing the Number of Clusters
In general, choosing the number of clusters is hard. When a dendrogram is available, a gap can be detected in the lengths of the links; this represents the dissimilarity between merged groups.
However: in real data this can be hard to detect.
There are Bayesian techniques to address this problem (Bayesian hierarchical clustering).

36 Evaluation of Clustering Algorithms
Clustering is unsupervised: evaluation of the output is hard, because no ground truth is given.
Intuitively, points in a cluster should be similar and points in different clusters dissimilar.
However, better methods use external information, such as labels or a reference clustering. Then we can compare clusterings with the labels using different metrics, e.g.
- purity
- mutual information

37 Purity
Define $N_{ij}$: the number of objects in cluster i that are in class j, and $N_i = \sum_{j=1}^{C} N_{ij}$: the number of objects in cluster i.
Define $p_{ij} = \frac{N_{ij}}{N_i}$ and the purity of cluster i as $p_i = \max_j p_{ij}$.
The overall purity is
$\mathrm{Purity} = \sum_i \frac{N_i}{N}\, p_i$
Purity ranges from 0 (bad) to 1 (good). But: a clustering with each object in its own cluster has a purity of 1.
[Figure: example clustering with Purity = 0.71]
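A direct computation of this measure; the contingency counts N_ij below are an assumed example chosen to reproduce the Purity = 0.71 value quoted for the slide figure.

```python
import numpy as np

# Contingency counts N_ij: rows = clusters, columns = classes.
# These counts are an assumed example, not taken from the slide figure itself.
N_ij = np.array([[5, 1, 0],
                 [1, 4, 1],
                 [2, 0, 3]])

N_i = N_ij.sum(axis=1)                 # objects per cluster
p_i = N_ij.max(axis=1) / N_i           # purity of each cluster: max_j p_ij
purity = (N_i / N_ij.sum()) @ p_i      # overall purity: sum_i (N_i / N) p_i
print(f"purity = {purity:.2f}")        # -> 0.71
```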

38 Mutual Information
Let U and V be two clusterings. Define the probability that a randomly chosen point belongs to cluster $u_i$ in U and to $v_j$ in V:
$p_{UV}(i, j) = \frac{|u_i \cap v_j|}{N}$
Also: the probability that a point is in $u_i$ is $p_U(i) = \frac{|u_i|}{N}$ (and analogously $p_V(j)$). Then
$I(U, V) = \sum_{i=1}^{R} \sum_{j=1}^{C} p_{UV}(i, j) \log \frac{p_{UV}(i, j)}{p_U(i)\, p_V(j)}$
This can be normalized to account for many small clusters with low entropy.
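A minimal sketch computing I(U, V) from two label vectors, plus a normalized variant (normalized here by the mean of the two cluster entropies, which is one common convention).

```python
import numpy as np

def mutual_information(labels_u, labels_v):
    """I(U, V) for two flat clusterings given as label vectors."""
    N = len(labels_u)
    us, u_idx = np.unique(labels_u, return_inverse=True)
    vs, v_idx = np.unique(labels_v, return_inverse=True)
    p_uv = np.zeros((len(us), len(vs)))
    for i, j in zip(u_idx, v_idx):
        p_uv[i, j] += 1.0 / N                    # p_UV(i,j) = |u_i ∩ v_j| / N
    p_u, p_v = p_uv.sum(axis=1), p_uv.sum(axis=0)
    mask = p_uv > 0                              # skip empty cells (0 log 0 := 0)
    mi = np.sum(p_uv[mask] * np.log(p_uv[mask] / np.outer(p_u, p_v)[mask]))
    h_u = -np.sum(p_u * np.log(p_u))
    h_v = -np.sum(p_v * np.log(p_v))
    return mi, mi / ((h_u + h_v) / 2)            # MI and a normalized MI

U = [0, 0, 0, 1, 1, 1, 2, 2]
V = [0, 0, 1, 1, 1, 1, 2, 2]
mi, nmi = mutual_information(U, V)
print(f"I(U,V) = {mi:.3f}, NMI = {nmi:.3f}")
```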

39 Summary
Several clustering methods exist:
- K-means clustering and Expectation-Maximization, both based on Gaussian Mixture Models
- K-means uses hard assignments, whereas EM uses soft assignments and also estimates the covariances
- The Dirichlet Process is a non-parametric model to perform clustering without specifying K
- Spectral clustering uses the graph Laplacian and performs an eigenvector analysis
Major problem: most clustering algorithms require the number of clusters to be given.


More information

Stochastic Processes, Kernel Regression, Infinite Mixture Models

Stochastic Processes, Kernel Regression, Infinite Mixture Models Stochastic Processes, Kernel Regression, Infinite Mixture Models Gabriel Huang (TA for Simon Lacoste-Julien) IFT 6269 : Probabilistic Graphical Models - Fall 2018 Stochastic Process = Random Function 2

More information

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012 Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood

More information

CSCI 5822 Probabilistic Model of Human and Machine Learning. Mike Mozer University of Colorado

CSCI 5822 Probabilistic Model of Human and Machine Learning. Mike Mozer University of Colorado CSCI 5822 Probabilistic Model of Human and Machine Learning Mike Mozer University of Colorado Topics Language modeling Hierarchical processes Pitman-Yor processes Based on work of Teh (2006), A hierarchical

More information

Spatial Bayesian Nonparametrics for Natural Image Segmentation

Spatial Bayesian Nonparametrics for Natural Image Segmentation Spatial Bayesian Nonparametrics for Natural Image Segmentation Erik Sudderth Brown University Joint work with Michael Jordan University of California Soumya Ghosh Brown University Parsing Visual Scenes

More information

MATH 829: Introduction to Data Mining and Analysis Clustering II

MATH 829: Introduction to Data Mining and Analysis Clustering II his lecture is based on U. von Luxburg, A Tutorial on Spectral Clustering, Statistics and Computing, 17 (4), 2007. MATH 829: Introduction to Data Mining and Analysis Clustering II Dominique Guillot Departments

More information

Spectral Clustering. Zitao Liu

Spectral Clustering. Zitao Liu Spectral Clustering Zitao Liu Agenda Brief Clustering Review Similarity Graph Graph Laplacian Spectral Clustering Algorithm Graph Cut Point of View Random Walk Point of View Perturbation Theory Point of

More information

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample

More information

Clustering VS Classification

Clustering VS Classification MCQ Clustering VS Classification 1. What is the relation between the distance between clusters and the corresponding class discriminability? a. proportional b. inversely-proportional c. no-relation Ans:

More information

Hierarchical Bayesian Nonparametrics

Hierarchical Bayesian Nonparametrics Hierarchical Bayesian Nonparametrics Micha Elsner April 11, 2013 2 For next time We ll tackle a paper: Green, de Marneffe, Bauer and Manning: Multiword Expression Identification with Tree Substitution

More information

Finite Singular Multivariate Gaussian Mixture

Finite Singular Multivariate Gaussian Mixture 21/06/2016 Plan 1 Basic definitions Singular Multivariate Normal Distribution 2 3 Plan Singular Multivariate Normal Distribution 1 Basic definitions Singular Multivariate Normal Distribution 2 3 Multivariate

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Clustering: Part 2 Instructor: Yizhou Sun yzsun@ccs.neu.edu November 3, 2015 Methods to Learn Matrix Data Text Data Set Data Sequence Data Time Series Graph

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013 Bayesian Methods Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2013 1 What about prior n Billionaire says: Wait, I know that the thumbtack is close to 50-50. What can you

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables? Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what

More information

Spectral Clustering. Guokun Lai 2016/10

Spectral Clustering. Guokun Lai 2016/10 Spectral Clustering Guokun Lai 2016/10 1 / 37 Organization Graph Cut Fundamental Limitations of Spectral Clustering Ng 2002 paper (if we have time) 2 / 37 Notation We define a undirected weighted graph

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1

More information

Bayesian Inference for Dirichlet-Multinomials

Bayesian Inference for Dirichlet-Multinomials Bayesian Inference for Dirichlet-Multinomials Mark Johnson Macquarie University Sydney, Australia MLSS Summer School 1 / 50 Random variables and distributed according to notation A probability distribution

More information

Last Time. Today. Bayesian Learning. The Distributions We Love. CSE 446 Gaussian Naïve Bayes & Logistic Regression

Last Time. Today. Bayesian Learning. The Distributions We Love. CSE 446 Gaussian Naïve Bayes & Logistic Regression CSE 446 Gaussian Naïve Bayes & Logistic Regression Winter 22 Dan Weld Learning Gaussians Naïve Bayes Last Time Gaussians Naïve Bayes Logistic Regression Today Some slides from Carlos Guestrin, Luke Zettlemoyer

More information

Generative v. Discriminative classifiers Intuition

Generative v. Discriminative classifiers Intuition Logistic Regression (Continued) Generative v. Discriminative Decision rees Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University January 31 st, 2007 2005-2007 Carlos Guestrin 1 Generative

More information

COS513 LECTURE 8 STATISTICAL CONCEPTS

COS513 LECTURE 8 STATISTICAL CONCEPTS COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 6: Cluster Analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2017/2018 Master in Mathematical Engineering

More information

ECE 5984: Introduction to Machine Learning

ECE 5984: Introduction to Machine Learning ECE 5984: Introduction to Machine Learning Topics: (Finish) Expectation Maximization Principal Component Analysis (PCA) Readings: Barber 15.1-15.4 Dhruv Batra Virginia Tech Administrativia Poster Presentation:

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

Bayesian Nonparametrics for Speech and Signal Processing

Bayesian Nonparametrics for Speech and Signal Processing Bayesian Nonparametrics for Speech and Signal Processing Michael I. Jordan University of California, Berkeley June 28, 2011 Acknowledgments: Emily Fox, Erik Sudderth, Yee Whye Teh, and Romain Thibaux Computer

More information

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf 1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Lior Wolf 2014-15 We know that X ~ B(n,p), but we do not know p. We get a random sample from X, a

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and

More information

PATTERN CLASSIFICATION

PATTERN CLASSIFICATION PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS

More information

K-Means, Expectation Maximization and Segmentation. D.A. Forsyth, CS543

K-Means, Expectation Maximization and Segmentation. D.A. Forsyth, CS543 K-Means, Expectation Maximization and Segmentation D.A. Forsyth, CS543 K-Means Choose a fixed number of clusters Choose cluster centers and point-cluster allocations to minimize error can t do this by

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

Machine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables?

Machine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables? Linear Regression Machine Learning CSE546 Sham Kakade University of Washington Oct 4, 2016 1 What about continuous variables? Billionaire says: If I am measuring a continuous variable, what can you do

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Sargur Srihari srihari@cedar.buffalo.edu 1 Topics 1. What are probabilistic graphical models (PGMs) 2. Use of PGMs Engineering and AI 3. Directionality in

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Clustering: Part 2 Instructor: Yizhou Sun yzsun@ccs.neu.edu October 19, 2014 Methods to Learn Matrix Data Set Data Sequence Data Time Series Graph & Network

More information

CS534 Machine Learning - Spring Final Exam

CS534 Machine Learning - Spring Final Exam CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the

More information

Gaussian Mixture Models, Expectation Maximization

Gaussian Mixture Models, Expectation Maximization Gaussian Mixture Models, Expectation Maximization Instructor: Jessica Wu Harvey Mudd College The instructor gratefully acknowledges Andrew Ng (Stanford), Andrew Moore (CMU), Eric Eaton (UPenn), David Kauchak

More information

COMS 4721: Machine Learning for Data Science Lecture 16, 3/28/2017

COMS 4721: Machine Learning for Data Science Lecture 16, 3/28/2017 COMS 4721: Machine Learning for Data Science Lecture 16, 3/28/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SOFT CLUSTERING VS HARD CLUSTERING

More information

Hierarchical Bayesian Languge Model Based on Pitman-Yor Processes. Yee Whye Teh

Hierarchical Bayesian Languge Model Based on Pitman-Yor Processes. Yee Whye Teh Hierarchical Bayesian Languge Model Based on Pitman-Yor Processes Yee Whye Teh Probabilistic model of language n-gram model Utility i-1 P(word i word i-n+1 ) Typically, trigram model (n=3) e.g., speech,

More information

More Spectral Clustering and an Introduction to Conjugacy

More Spectral Clustering and an Introduction to Conjugacy CS8B/Stat4B: Advanced Topics in Learning & Decision Making More Spectral Clustering and an Introduction to Conjugacy Lecturer: Michael I. Jordan Scribe: Marco Barreno Monday, April 5, 004. Back to spectral

More information

Curve Fitting Re-visited, Bishop1.2.5

Curve Fitting Re-visited, Bishop1.2.5 Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the

More information

28 : Approximate Inference - Distributed MCMC

28 : Approximate Inference - Distributed MCMC 10-708: Probabilistic Graphical Models, Spring 2015 28 : Approximate Inference - Distributed MCMC Lecturer: Avinava Dubey Scribes: Hakim Sidahmed, Aman Gupta 1 Introduction For many interesting problems,

More information

MAD-Bayes: MAP-based Asymptotic Derivations from Bayes

MAD-Bayes: MAP-based Asymptotic Derivations from Bayes MAD-Bayes: MAP-based Asymptotic Derivations from Bayes Tamara Broderick Brian Kulis Michael I. Jordan Cat Clusters Mouse clusters Dog 1 Cat Clusters Dog Mouse Lizard Sheep Picture 1 Picture 2 Picture 3

More information

Final Exam, Machine Learning, Spring 2009

Final Exam, Machine Learning, Spring 2009 Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3

More information

Learning Bayesian network : Given structure and completely observed data

Learning Bayesian network : Given structure and completely observed data Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution

More information

Statistics 202: Data Mining. c Jonathan Taylor. Model-based clustering Based in part on slides from textbook, slides of Susan Holmes.

Statistics 202: Data Mining. c Jonathan Taylor. Model-based clustering Based in part on slides from textbook, slides of Susan Holmes. Model-based clustering Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Model-based clustering General approach Choose a type of mixture model (e.g. multivariate Normal)

More information

A Brief Overview of Nonparametric Bayesian Models

A Brief Overview of Nonparametric Bayesian Models A Brief Overview of Nonparametric Bayesian Models Eurandom Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin Also at Machine

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information