Supplementary Material: Weakly Supervised Learning of Heterogeneous Concepts in Videos
Sohil Shah, Kuldeep Kulkarni, Arijit Biswas, Ankit Gandhi, Om Deshmukh, and Larry S. Davis

Expectation Constraints

In a Bayesian framework, the effective constraints are defined as expectations [1,2] of the original constraints from the main paper and can be rewritten as follows. For all i ∈ {1, ..., M},

\[ \sum_{j=1}^{N_i} \mathbb{E}_q\big[z^i_{js}\, z^i_{ja}\big] \ge 1 - \xi^i_{s,a}, \qquad \forall (s,a) \in \Gamma_i \tag{S1} \]

\[ \sum_{j=1}^{N_i} \mathbb{E}_q\big[z^i_{js}\big] \ge 1 - \xi^i_{s,\cdot}, \qquad \forall (s,\cdot) \in \Gamma_i \tag{S2} \]

\[ \sum_{j=1}^{N_i} \mathbb{E}_q\big[z^i_{ja}\big] \ge 1 - \xi^i_{\cdot,a}, \qquad \forall (\cdot,a) \in \Gamma_i \tag{S3} \]

and for all i ∈ {1, ..., M} and j ∈ {1, ..., N_i},

\[ \mathbb{E}_q\big[z^i_{js}\big] = 0 \quad \text{if } (s,\cdot) \notin \Gamma_i \text{ and } (s,a) \notin \Gamma_i \ \forall a \in A \tag{S4} \]

\[ \mathbb{E}_q\big[z^i_{ja}\big] = 0 \quad \text{if } (\cdot,a) \notin \Gamma_i \text{ and } (s,a) \notin \Gamma_i \ \forall s \in S \tag{S5} \]

where the expectation is taken w.r.t. the variational posterior distribution q. Note that, through π^i_a, the samples of z^i_{ja} depend on previously sampled latent coefficients such as z^i_{js}. This complicates the applicability of the constraint in equation (S1). However, due to the independence assumption, restricting the search to the family of tractable posterior distributions simplifies the constraints in equations (S1)-(S5) to the following. For all i ∈ {1, ..., M},

\[ \sum_{j=1}^{N_i} \nu^i_{js}\, \nu^i_{ja} \ge 1 - \xi^i_{s,a}, \qquad \forall (s,a) \in \Gamma_i \tag{S6} \]

\[ \sum_{j=1}^{N_i} \nu^i_{js} \ge 1 - \xi^i_{s,\cdot}, \qquad \forall (s,\cdot) \in \Gamma_i \tag{S7} \]

\[ \sum_{j=1}^{N_i} \nu^i_{ja} \ge 1 - \xi^i_{\cdot,a}, \qquad \forall (\cdot,a) \in \Gamma_i \tag{S8} \]

and for all i ∈ {1, ..., M} and j ∈ {1, ..., N_i},

\[ \nu^i_{js} = 0 \quad \text{if } (s,\cdot) \notin \Gamma_i \text{ and } (s,a) \notin \Gamma_i \ \forall a \in A \tag{S9} \]

\[ \nu^i_{ja} = 0 \quad \text{if } (\cdot,a) \notin \Gamma_i \text{ and } (s,a) \notin \Gamma_i \ \forall s \in S \tag{S10} \]
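The reduction from (S1) to (S6) uses only the factorized Bernoulli form of the variational posterior; as a worked step (a sketch in the notation above, with ν^i_{jk} = E_q[z^i_{jk}]):

```latex
% Under the fully factorized family q(Z) = \prod_{j,k} q(z^i_{jk}),
% with z^i_{jk} \sim \mathrm{Bernoulli}(\nu^i_{jk}), the joint
% expectation in (S1) factorizes into a product of means:
\mathbb{E}_q\!\big[z^i_{js}\, z^i_{ja}\big]
  = \mathbb{E}_q\!\big[z^i_{js}\big]\,\mathbb{E}_q\!\big[z^i_{ja}\big]
  = \nu^i_{js}\,\nu^i_{ja}
\quad\Longrightarrow\quad
\sum_{j=1}^{N_i} \nu^i_{js}\,\nu^i_{ja} \;\ge\; 1 - \xi^i_{s,a}.
```

The same substitution applied to (S2)-(S5) yields (S7)-(S10) directly.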
Derivation of Posterior Update Equations

Note that the constraints in (S6)-(S10) can be rewritten as hinge-loss functions and added to the variational objective from the main paper. Hence the final formulation is given by

\[
\min_{\nu^i, \tau^i, \Phi_k, \sigma_k} \; \mathrm{KL}\big(w(Y)\,\big\|\,\Psi(Y|\Theta)\big) - \sum_{i=1}^{M} \sum_{j=1}^{N_i} \sum_{e \in \{s,a\}} \int \log p\big(x^{ei}_j \,\big|\, Y, \Theta\big)\, w(Y)\, dY
\]
\[
+\; C \sum_{i=1}^{M} \Bigg[ \sum_{\substack{J \in \Gamma_i \\ J=(s,a)}} \max\Big(0,\; 1 - \sum_{j=1}^{N_i} \nu^i_{js}\nu^i_{ja}\Big) + \sum_{\substack{J \in \Gamma_i \\ J=(s,\cdot)}} \max\Big(0,\; 1 - \sum_{j=1}^{N_i} \nu^i_{js}\Big) + \sum_{\substack{J \in \Gamma_i \\ J=(\cdot,a)}} \max\Big(0,\; 1 - \sum_{j=1}^{N_i} \nu^i_{ja}\Big) \Bigg] \tag{S11}
\]

subject to constraints (S9) and (S10) for all i ∈ {1, ..., M} and j ∈ {1, ..., N_i}.

The objective function in (S11) can be rewritten as

\[
\mathcal{L}(\nu^i, \tau^i, \Phi_k, \sigma^2_k) = \mathcal{L}_1 - \sum_{i=1}^{M} \sum_{j=1}^{N_i} \mathcal{L}^i_{2j} + C \sum_{i=1}^{M} \sum_{k=1}^{K_a+K_s} \mathcal{H}^i_k \tag{S12}
\]

where \mathcal{L}_1 represents the KL-divergence term, \mathcal{L}^i_{2j} denotes the likelihood term, and \mathcal{H}^i_k is the term corresponding to the hinge-loss function for ν^i. Expanding \mathcal{L}^i_{2j}, we get

\[
\mathcal{L}^i_{2j} = \mathbb{E}_w\big[\log p(x^{si}_j \mid Y, \Theta) + \log p(x^{ai}_j \mid Y, \Theta)\big] \tag{S13}
\]
\[
= -\frac{1}{2\sigma^2_{ns}}\Big( x^{si\,T}_j x^{si}_j - 2\,\mathbb{E}_w\big[z^i_{j\cdot} A_s\big]\, x^{si}_j + \mathbb{E}_w\big[z^i_{j\cdot} U_s\, z^{i\,T}_{j\cdot}\big] \Big) - \frac{D_s}{2}\log\big(2\pi\sigma^2_{ns}\big)
\]
\[
\quad -\frac{1}{2\sigma^2_{na}}\Big( x^{ai\,T}_j x^{ai}_j - 2\,\mathbb{E}_w\big[z^i_{j\cdot} A_a\big]\, x^{ai}_j + \mathbb{E}_w\big[z^i_{j\cdot} U_a\, z^{i\,T}_{j\cdot}\big] \Big) - \frac{D_a}{2}\log\big(2\pi\sigma^2_{na}\big) \tag{S14}
\]

where U = \mathbb{E}_w[A A^T] is a K_max × K_max matrix with U_{kk} = D\sigma^2_k + \Phi_{k\cdot}\Phi^T_{k\cdot} and U_{kk'} = \Phi_{k\cdot}\Phi^T_{k'\cdot} for k ≠ k'; \mathbb{E}_w[z^i_{j\cdot} A]\, x^i_j = \sum_k \nu^i_{jk}\, \Phi_{k\cdot}\, x^i_j; and

\[
\mathbb{E}_w\big[z^i_{j\cdot} U z^{i\,T}_{j\cdot}\big] = 2 \sum_{k' < k} \nu^i_{jk'}\, \nu^i_{jk}\, U_{k'k} + \sum_{k} \nu^i_{jk} \big( D\sigma^2_k + \Phi_{k\cdot}\Phi^T_{k\cdot} \big).
\]
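To make the hinge-loss penalty in (S11) concrete, the sketch below evaluates the three penalty terms for one video directly from the posterior means. This is a minimal illustration assuming a dense array layout; the names (nu_s, nu_a, the three label lists encoding Γ_i) are our own, not the authors' implementation.

```python
import numpy as np

def hinge_penalty(nu_s, nu_a, pair_labels, subj_labels, act_labels, C=1.0):
    """Hinge-loss penalty for one video, cf. eq. (S11) (illustrative sketch).

    nu_s : (N_i, K_s) array of posterior means nu^i_{js} (subject concepts)
    nu_a : (N_i, K_a) array of posterior means nu^i_{ja} (action concepts)
    pair_labels : (s, a) index pairs in Gamma_i
    subj_labels : subject indices s with (s, .) in Gamma_i
    act_labels  : action indices a with (., a) in Gamma_i
    """
    penalty = 0.0
    # Paired labels: some proposal j should carry both concepts jointly,
    # i.e. sum_j nu^i_{js} nu^i_{ja} >= 1 (constraint S6).
    for s, a in pair_labels:
        penalty += max(0.0, 1.0 - float(np.dot(nu_s[:, s], nu_a[:, a])))
    # Subject-only labels (constraint S7).
    for s in subj_labels:
        penalty += max(0.0, 1.0 - nu_s[:, s].sum())
    # Action-only labels (constraint S8).
    for a in act_labels:
        penalty += max(0.0, 1.0 - nu_a[:, a].sum())
    return C * penalty
```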
For the KL-divergence term, we get

\[
\mathcal{L}_1 = \mathrm{KL}\big(w(Y)\,\big\|\,\Psi(Y|\Theta)\big) = \mathrm{KL}\big(w(Z)\,\big\|\,\Psi(Z|\Theta)\big) + \mathrm{KL}\big(w(v)\,\big\|\,\Psi(v|\Theta)\big) + \mathrm{KL}\big(w(A_s)\,\big\|\,\Psi(A_s|\Theta)\big) + \mathrm{KL}\big(w(A_a)\,\big\|\,\Psi(A_a|\Theta)\big) \tag{S15}
\]

where the individual terms are

\[
\mathrm{KL}\big(w(v)\,\big\|\,\Psi(v|\Theta)\big) = \sum_{i=1}^{M} \Bigg\{ \sum_{k=1}^{K_{\max}} \bigg[ \big(\tau^i_{k1} - \alpha\big)\Big(\Psi(\tau^i_{k1}) - \Psi(\tau^i_{k1} + \tau^i_{k2})\Big) + \big(\tau^i_{k2} - 1\big)\Big(\Psi(\tau^i_{k2}) - \Psi(\tau^i_{k1} + \tau^i_{k2})\Big) - \log\frac{\Gamma(\tau^i_{k1})\,\Gamma(\tau^i_{k2})}{\Gamma(\tau^i_{k1} + \tau^i_{k2})} \bigg] - K_{\max}\log\alpha \Bigg\} \tag{S16}
\]

\[
\mathrm{KL}\big(w(Z)\,\big\|\,\Psi(Z|\Theta)\big) = \sum_{i=1}^{M} \sum_{j=1}^{N_i} \sum_{k=1}^{K_{\max}} \bigg[ -\nu^i_{jk} \sum_{m=1}^{k} \Big(\Psi(\tau^i_{m1}) - \Psi(\tau^i_{m1} + \tau^i_{m2})\Big) - \big(1 - \nu^i_{jk}\big)\, \mathbb{E}_w\Big[\log\Big(1 - \prod_{m=1}^{k} v^i_m\Big)\Big] + \nu^i_{jk}\log\nu^i_{jk} + \big(1 - \nu^i_{jk}\big)\log\big(1 - \nu^i_{jk}\big) \bigg] \tag{S17}
\]

\[
\mathrm{KL}\big(w(A)\,\big\|\,\Psi(A|\Theta)\big) = \sum_{k=1}^{K_{\max}} \bigg[ \frac{D\sigma^2_k + \Phi_{k\cdot}\Phi^T_{k\cdot}}{2\sigma^2_A} - \frac{D}{2}\Big(1 + \log\frac{\sigma^2_k}{\sigma^2_A}\Big) \bigg] \tag{S18}
\]

where Ψ(·) is the digamma function. As shown for the original IBP in [3], the term \mathbb{E}_w[\log(1 - \prod_{m=1}^{k} v^i_m)] is approximated by its lower bound

\[
\mathbb{E}_w\Big[\log\Big(1 - \prod_{m=1}^{k} v^i_m\Big)\Big] \;\ge\; \sum_{m=1}^{k} q_{km}\,\Psi(\tau^i_{m2}) + \sum_{m=1}^{k} \Big(\sum_{n=m+1}^{k} q_{kn}\Big)\Psi(\tau^i_{m1}) - \sum_{m=1}^{k} \Big(\sum_{n=m}^{k} q_{kn}\Big)\Psi(\tau^i_{m1} + \tau^i_{m2}) + H(q_{k\cdot}) \;=\; \mathcal{L}_k \tag{S19}
\]

where the variational parameter q_{k·} = (q_{k1}, ..., q_{kk}) is a k-point probability mass function and H(q_{k·}) denotes the entropy of q_{k·}. The tightest bound is obtained by setting

\[
q_{km} = \frac{1}{Z_k} \exp\bigg( \Psi(\tau^i_{m2}) + \sum_{n=1}^{m-1} \Psi(\tau^i_{n1}) - \sum_{n=1}^{m} \Psi(\tau^i_{n1} + \tau^i_{n2}) \bigg)
\]

where Z_k is the normalization factor that makes q_{k·} a valid distribution. On replacing the term \mathbb{E}_w[\log(1 - \prod_{m=1}^{k} v^i_m)] with its lower bound \mathcal{L}_k, we obtain an upper bound on \mathrm{KL}(w(Y)\,\|\,\Psi(Y|\Theta)).
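As a numerical illustration of the bound (S19) and the optimal q_{k·}, the sketch below computes \mathcal{L}_k with SciPy's digamma. The (K_max, 2) layout of the Beta parameters τ^i and the function name are our assumptions for this sketch.

```python
import numpy as np
from scipy.special import digamma

def multinomial_lower_bound(tau, k):
    """Lower bound L_k on E[log(1 - prod_{m<=k} v^i_m)], cf. eq. (S19) and [3].

    tau : (K_max, 2) array of Beta parameters (tau^i_{m1}, tau^i_{m2})
    k   : number of sticks in the product (1-indexed)
    """
    t1, t2 = tau[:k, 0], tau[:k, 1]
    # Optimal k-point distribution q_{k.} (up to the normalizer Z_k).
    log_q = (digamma(t2)
             + np.concatenate(([0.0], np.cumsum(digamma(t1))[:-1]))
             - np.cumsum(digamma(t1 + t2)))
    q = np.exp(log_q - log_q.max())
    q /= q.sum()                               # divide by Z_k
    # Tail sums: sum_{n=m}^{k} q_{kn} and sum_{n=m+1}^{k} q_{kn}.
    tail_incl = np.cumsum(q[::-1])[::-1]
    tail_excl = np.concatenate((tail_incl[1:], [0.0]))
    entropy = -np.sum(q * np.log(q + 1e-12))   # H(q_{k.})
    return (q @ digamma(t2) + tail_excl @ digamma(t1)
            - tail_incl @ digamma(t1 + t2) + entropy)
```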
On substituting equations (S13)-(S19) into (S12), the optimal values of the parameters of the mean-field variational approximate posterior distribution are obtained by setting the derivatives of (S12) w.r.t. those parameters to zero and solving for all parameters simultaneously using the KKT conditions. We derive the following equations, which are solved iteratively:

\[
\sigma^2_{ke} = \bigg( \frac{1}{\sigma^2_{Ae}} + \frac{1}{\sigma^2_{ne}} \sum_{i=1}^{M} \sum_{j=1}^{N_i} \nu^i_{jk} \bigg)^{-1}, \qquad e \in \{s, a\} \tag{S20}
\]

\[
\Phi^e_{k\cdot} = \frac{\sigma^2_{ke}}{\sigma^2_{ne}} \sum_{i=1}^{M} \sum_{j=1}^{N_i} \nu^i_{jk} \Big( x^{ei}_j - \sum_{l \neq k} \nu^i_{jl}\, \Phi^e_{l\cdot} \Big)^T, \qquad e \in \{s, a\} \tag{S21}
\]

\[
\tau^i_{k1} = \alpha + \sum_{m=k}^{K_{\max}} \sum_{j=1}^{N_i} \nu^i_{jm} + \sum_{m=k+1}^{K_{\max}} \Big( N_i - \sum_{j=1}^{N_i} \nu^i_{jm} \Big) \Big( \sum_{s=k+1}^{m} q^i_{ms} \Big) \tag{S22}
\]

\[
\tau^i_{k2} = 1 + \sum_{m=k}^{K_{\max}} \Big( N_i - \sum_{j=1}^{N_i} \nu^i_{jm} \Big)\, q^i_{mk} \tag{S23}
\]

The above equations are similar to those given by the variational approximation for the IBP [3]. The update equation for ν, however, differs completely; it is given by

\[
\nu^i_{jk} = \frac{L^i_k}{1 + e^{-\zeta^i_{jk}}} \tag{S24}
\]

\[
\zeta^i_{jk} = \sum_{t=1}^{k} \Big( \Psi(\tau^i_{t1}) - \Psi(\tau^i_{t1} + \tau^i_{t2}) \Big) - \mathcal{L}_k - \frac{1}{2\sigma^2_{na}}\Big( D_a \sigma^2_{ka} + \Phi^a_{k\cdot}\Phi^{aT}_{k\cdot} \Big) + \frac{1}{\sigma^2_{na}}\, \Phi^a_{k\cdot}\Big( x^{ai}_j - \sum_{l \neq k} \nu^i_{jl}\, \Phi^a_{l\cdot} \Big)^T
\]
\[
\quad - \frac{1}{2\sigma^2_{ns}}\Big( D_s \sigma^2_{ks} + \Phi^s_{k\cdot}\Phi^{sT}_{k\cdot} \Big) + \frac{1}{\sigma^2_{ns}}\, \Phi^s_{k\cdot}\Big( x^{si}_j - \sum_{l \neq k} \nu^i_{jl}\, \Phi^s_{l\cdot} \Big)^T + C \sum_{\substack{J \in \Gamma_i \\ J=(s,k)}} \nu^i_{js}\, \mathbb{I}\Big\{ \textstyle\sum_{l=1}^{N_i} \nu^i_{ls}\, \nu^i_{lk} < 1 \Big\}
\]
\[
\quad + C \sum_{\substack{J \in \Gamma_i \\ J=(k,a)}} \nu^i_{ja}\, \mathbb{I}\Big\{ \textstyle\sum_{l=1}^{N_i} \nu^i_{lk}\, \nu^i_{la} < 1 \Big\} + C\, \mathbb{I}\Big\{ \textstyle\sum_{l=1}^{N_i} \nu^i_{lk} < 1 \Big\}, \qquad \forall k \in \{1, \ldots, K_a + K_s\} \tag{S25}
\]

where L^i_k ∈ {0, 1} and \mathbb{I}{·} is an indicator variable. L^i_k indicates whether the entity (action/subject) k is part of the i-th video's label set Γ_i or not; this in turn enforces ν^i_{jk} = 0 for all ν satisfying eq. (S9) and eq. (S10).
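For concreteness, here is a minimal sketch of the global coordinate-ascent updates (S20)-(S21) and the logistic update (S24) for one modality e ∈ {s, a}. The dense array layout (all videos' proposals stacked row-wise) and the function names are our assumptions, not the authors' implementation.

```python
import numpy as np

def update_factors(X, nu, Phi, sigma_A2, sigma_n2):
    """Coordinate-ascent updates (S20)-(S21) for one modality (sketch).

    X        : (N, D) stacked proposal features x^{ei}_j over all videos
    nu       : (N, K) posterior means nu^i_{jk} for the same proposals
    Phi      : (K, D) current factor means Phi^e_{k.}, updated in place
    sigma_A2 : prior variance sigma^2_{Ae}
    sigma_n2 : observation-noise variance sigma^2_{ne}
    """
    K = nu.shape[1]
    # Posterior variances, eq. (S20).
    sigma_k2 = 1.0 / (1.0 / sigma_A2 + nu.sum(axis=0) / sigma_n2)
    for k in range(K):
        others = np.delete(np.arange(K), k)
        # Residual of X after the mean reconstruction by all other factors.
        resid = X - nu[:, others] @ Phi[others]
        # Posterior means, eq. (S21).
        Phi[k] = (sigma_k2[k] / sigma_n2) * (nu[:, k] @ resid)
    return sigma_k2, Phi

def update_nu(zeta, L):
    """Logistic update nu^i_{jk} = L^i_k / (1 + exp(-zeta^i_{jk})), eq. (S24).

    zeta : natural parameters from eq. (S25); L : 0/1 label mask L^i_k.
    """
    return L / (1.0 + np.exp(-zeta))
```

In practice these updates are swept together with the τ updates (S22)-(S23) until the variational bound converges.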
The hyperparameters σ²_n and σ²_A can either be set a priori or estimated from the data. The empirical estimates are easily derived by maximizing the expected log-likelihood, analogous to the maximization step of the EM algorithm. The closed-form solutions are given by

\[
\sigma^2_A = \frac{\sum_{k=1}^{K_{\max}} \big( D\sigma^2_k + \Phi_{k\cdot}\Phi^T_{k\cdot} \big)}{K_{\max}\, D} \tag{S26}
\]

\[
\sigma^2_n = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N_i} \Big( x^{i\,T}_j x^i_j - 2\,\mathbb{E}_w\big[z^i_{j\cdot} A\big]\, x^i_j + \mathbb{E}_w\big[z^i_{j\cdot} U z^{i\,T}_{j\cdot}\big] \Big)}{\sum_{i=1}^{M} N_i\, D} \tag{S27}
\]

The final algorithm is summarized in Algorithm 1 of the main paper.

Additional Experimental Results

In this section we present additional results that provide further insight into the experiments.

Casablanca: The person-class confusion matrix is shown in Figure 1. It shows that our approach learns an appearance model for each person with high accuracy, and that it can learn from only a few weakly annotated samples.

Fig. 1. Person class confusion matrix. BG denotes the background class, which can represent any unknown face.

A2D: Figure 2 shows additional qualitative results. Red boxes represent the generated proposals, green boxes represent the proposals selected by the WSC-SIIBP algorithm, and magenta boxes represent the ground-truth annotations. Where proposal boxes overlap, only the last plotted rectangle is visible; the boxes were plotted in the following order: red first, then magenta, then green. Additionally, we have attached videos alongside this supplementary material depicting the generated proposals and their automatic selection.
Tags shown: {ball, rolling}, {dog, running}; {baby, walking}, {human}; {human}, {bird, climbing}; {dog, walking}, {human, walking}, {car}; {bird, eating}, {cat}

Fig. 2. Qualitative results of weakly supervised concept localization on the A2D dataset using the WSC-SIIBP algorithm. Tags are the weak paired-label input for each video.

References

1. Zhu, J., Chen, N., Xing, E.P.: Bayesian inference with posterior regularization and applications to infinite latent SVMs. Journal of Machine Learning Research (2014)
2. Ganchev, K., Graça, J., Gillenwater, J., Taskar, B.: Posterior regularization for structured latent variable models. Journal of Machine Learning Research (2010)
3. Doshi, F., Miller, K., Van Gael, J., Teh, Y.W.: Variational inference for the Indian buffet process. In: International Conference on Artificial Intelligence and Statistics (2009)