Collapsed Gibbs and Variational Methods for LDA. Example Collapsed MoG Sampling
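1 Case Study: Document Retrieval. Collapsed Gibbs and Variational Methods for LDA. Machine Learning/Statistics for Big Data, CSE599C/STAT59, University of Washington. Emily Fox, February 7th.
Example: Collapsed MoG Sampling. Model: $\pi \sim \mathrm{Dir}(\alpha_1, \ldots, \alpha_K)$, $z^i \sim \pi$, $\{\mu_k, \Sigma_k\} \sim F(\lambda)$, $x^i \mid z^i \sim N(x^i; \mu_{z^i}, \Sigma_{z^i})$, for $i = 1, \ldots, N$. Collapsed sampler: sample the indicators $z^i$ with the parameters marginalized out.
[Graphical model: $z^i \to x^i$ inside a plate of size $N$.]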
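2 Example: Collapsed MoG Sampling (derivation). Important facts:
$$p(z_{1:N} \mid \alpha) = \frac{\Gamma\left(\sum_k \alpha_k\right)}{\prod_k \Gamma(\alpha_k)} \cdot \frac{\prod_k \Gamma(n_k + \alpha_k)}{\Gamma\left(\sum_k (n_k + \alpha_k)\right)}, \qquad \Gamma(m + 1) = m\,\Gamma(m),$$
where $n_k$ is the number of observations assigned to component $k$.
Latent Dirichlet Allocation (LDA)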
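The two "important facts" can be checked against each other numerically. The sketch below is only an illustration: the function names, the toy assignment vector, and the hyperparameter values are made up here and are not from the lecture. It evaluates the closed form for $\log p(z_{1:N} \mid \alpha)$ and compares it with the chain-rule product of predictive probabilities $(n_k + \alpha_k) / (i + \sum_j \alpha_j)$, which is what the identity $\Gamma(m + 1) = m\,\Gamma(m)$ yields for a ratio of closed forms.

```python
import math

def log_marginal_closed_form(counts, alpha):
    """log p(z_{1:N} | alpha) with the mixture weights integrated out.

    counts[k] is n_k (observations assigned to component k);
    alpha[k] is the Dirichlet hyperparameter alpha_k.
    """
    alpha_sum = sum(alpha)
    out = math.lgamma(alpha_sum) - math.lgamma(sum(counts) + alpha_sum)
    for n_k, a_k in zip(counts, alpha):
        out += math.lgamma(n_k + a_k) - math.lgamma(a_k)
    return out

def log_marginal_sequential(z, alpha):
    """Same quantity via the chain rule p(z_1) p(z_2 | z_1) ... p(z_N | z_{1:N-1})."""
    counts = [0] * len(alpha)
    alpha_sum = sum(alpha)
    out = 0.0
    for i, k in enumerate(z):
        # Predictive probability of component k after i earlier assignments.
        out += math.log((counts[k] + alpha[k]) / (i + alpha_sum))
        counts[k] += 1
    return out

# Hypothetical toy input: six observations, three components.
z = [0, 2, 2, 1, 2, 0]
alpha = [0.5, 1.0, 2.0]
counts = [z.count(k) for k in range(len(alpha))]
closed = log_marginal_closed_form(counts, alpha)
sequential = log_marginal_sequential(z, alpha)
print(closed, sequential)
```

The sequential form is the one a collapsed sampler actually uses: each conditional only needs the current counts, never the Gamma functions themselves.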
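3 LDA Generative Model. Observations: $w^d_1, \ldots, w^d_{N_d}$. Associated topics: $z^d_1, \ldots, z^d_{N_d}$. Parameters: $\theta = \{\{\pi^d\}, \{\beta_k\}\}$. Generative model:
$$p(\pi, \beta, z, w) = \prod_{k=1}^{K} p(\beta_k \mid \lambda) \prod_{d=1}^{D} \left( p(\pi^d \mid \alpha) \prod_{i=1}^{N_d} p(z^d_i \mid \pi^d)\, p(w^d_i \mid z^d_i, \beta) \right)$$
[Graphical model: $z^d_i \to w^d_i$ inside plates of size $N_d$ and $D$.]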
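The factorization reads directly as a sampling procedure: draw each topic's word distribution, then for each document draw topic weights, then a topic indicator and a word for every position. A minimal sketch follows, assuming Dirichlet priors on both the topic weights and the word distributions and a fixed document length. All names (`generate_corpus`, `alpha`, `lam`) and the toy sizes are illustrative choices, not taken from the slides.

```python
import random

def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(a, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [g / total for g in draws]

def sample_categorical(probs, rng):
    """Draw an index k with probability probs[k]."""
    return rng.choices(range(len(probs)), weights=probs)[0]

def generate_corpus(num_docs, doc_len, alpha, lam, seed=0):
    """Sample topics beta_k, then (z^d, w^d) for each document d."""
    rng = random.Random(seed)
    # One word distribution per topic; len(alpha) is the number of topics.
    beta = [sample_dirichlet(lam, rng) for _ in alpha]
    corpus = []
    for _ in range(num_docs):
        pi_d = sample_dirichlet(alpha, rng)                       # topic weights
        z_d = [sample_categorical(pi_d, rng) for _ in range(doc_len)]
        w_d = [sample_categorical(beta[k], rng) for k in z_d]     # words
        corpus.append((z_d, w_d))
    return beta, corpus

# Hypothetical toy sizes: 2 topics, 5 vocabulary words, 3 documents of 8 words.
beta, corpus = generate_corpus(num_docs=3, doc_len=8,
                               alpha=[0.5, 0.5], lam=[0.5] * 5)
for z_d, w_d in corpus:
    print(z_d, w_d)
```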
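4 Collapsed LDA Sampling. Marginalize the parameters: the document-specific topic weights and the corpus-wide topic-specific word distributions. Sample topic indicators for each word. Derivation:
$$p(z^d_{1:N_d} \mid \alpha) = \frac{\Gamma\left(\sum_k \alpha_k\right)}{\prod_k \Gamma(\alpha_k)} \cdot \frac{\prod_k \Gamma(n^d_k + \alpha_k)}{\Gamma\left(\sum_k (n^d_k + \alpha_k)\right)}, \qquad p(z \mid \alpha) = \prod_{d=1}^{D} p(z^d_{1:N_d} \mid \alpha)$$
$$p(\{w^d_i : z^d_i = k\} \mid \lambda) = \frac{\Gamma\left(\sum_v \lambda_v\right)}{\prod_v \Gamma(\lambda_v)} \cdot \frac{\prod_v \Gamma(v_{k,v} + \lambda_v)}{\Gamma\left(\sum_v (v_{k,v} + \lambda_v)\right)}, \qquad p(w \mid z, \lambda) = \prod_{k=1}^{K} p(\{w^d_i : z^d_i = k\} \mid \lambda)$$
Here $n^d_k$ counts the words in document $d$ assigned to topic $k$, and $v_{k,v}$ counts the assignments of vocabulary word $v$ to topic $k$ across the corpus.
Algorithm: resample each topic indicator $z^d_i$ in turn from its conditional distribution given all other assignments.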
5 Sample Document. [Figure: an example document about Etruscan trade.]
Randomly Assign Topics: each word in the document receives a random initial topic indicator $z^d_i$.
6 Randomly Assign Topics, repeated for every document in the corpus. [Figure: many documents whose words, such as "Etruscan", "trade", "ship", and "Italy", carry topic indicators $z^d_i$.]
Maintain Global Statistics: total counts from all docs of how often each word is assigned to each topic.
7 Resample Assignments: revisit the topic indicator $z^d_i$ of each word in turn.
What is the conditional distribution for this topic?
8 What is the conditional distribution for this topic?
Part I: How much does this document like each topic?
Part II: How much does each topic like this word? [Figure: counts of the word "trade" under each of the three topics.]
9 What is the conditional distribution for this topic? Combining Part I (how much this document likes each topic) and Part II (how much each topic likes this word), for the word "trade":
$$p(z^d_i = k \mid \text{all other assignments}, w) \propto \frac{n^d_k + \alpha_k}{\sum_{j=1}^{K} (n^d_j + \alpha_j)} \cdot \frac{v_{k,\text{trade}} + \lambda_{\text{trade}}}{\sum_{j=1}^{V} (v_{k,j} + \lambda_j)}$$
The counts $n^d_k$ and $v_{k,j}$ exclude the word currently being resampled.
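As a worked instance of the two-part conditional, the sketch below multiplies the document factor by the topic factor for each topic and normalizes. The counts are invented for illustration (three topics, a two-word vocabulary, with word index 0 standing in for "trade"), and `topic_conditional` is a hypothetical helper name.

```python
def topic_conditional(doc_topic_counts, topic_word_counts, word, alpha, lam):
    """Normalized conditional over topics for one word.

    doc_topic_counts[k]     : n^d_k, excluding the word being resampled
    topic_word_counts[k][v] : v_{k,v}, excluding the word being resampled
    alpha[k], lam[v]        : Dirichlet hyperparameters
    """
    weights = []
    for k in range(len(alpha)):
        # Part I: how much does this document like topic k?
        doc_part = (doc_topic_counts[k] + alpha[k]) / (sum(doc_topic_counts) + sum(alpha))
        # Part II: how much does topic k like this word?
        word_part = (topic_word_counts[k][word] + lam[word]) / (
            sum(topic_word_counts[k]) + sum(lam))
        weights.append(doc_part * word_part)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical counts: the document mostly uses topic 0, and topic 0 mostly
# emits word 0, so topic 0 should dominate the conditional.
probs = topic_conditional(
    doc_topic_counts=[3, 1, 0],
    topic_word_counts=[[10, 0], [1, 9], [5, 5]],
    word=0,
    alpha=[1.0, 1.0, 1.0],
    lam=[0.5, 0.5],
)
print(probs)
```

With these numbers the unnormalized weights are proportional to 42, 3, and 5.5, so the first topic receives 42/50.5 of the mass.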
10 Sample a New Topic Indicator: draw the new $z^d_i$ from this conditional distribution.
Update Counts: add the new assignment back into the document's topic counts and the global word-topic counts.
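The walk-through (random initialization, global counts, remove a word, compute the conditional, sample, update counts) can be put together as one loop. This is a minimal sketch rather than the lecture's implementation: it assumes symmetric scalar hyperparameters `alpha` and `lam`, represents documents as lists of integer word ids, and uses a hypothetical toy corpus. The document-side denominator is dropped because it does not depend on the topic.

```python
import random

def collapsed_gibbs(docs, num_topics, vocab_size, alpha, lam, sweeps, seed=0):
    """Collapsed Gibbs sampling for LDA with symmetric scalar hyperparameters."""
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]             # per-document topic counts
    v_kw = [[0] * vocab_size for _ in range(num_topics)]  # global topic-word counts
    v_k = [0] * num_topics                               # total words per topic
    z = []
    # Randomly assign topics and build the global statistics.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            v_kw[k][w] += 1
            v_k[k] += 1
        z.append(z_d)
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                v_kw[k][w] -= 1
                v_k[k] -= 1
                # Unnormalized conditional: document part times topic part.
                weights = [
                    (n_dk[d][j] + alpha) * (v_kw[j][w] + lam) / (v_k[j] + vocab_size * lam)
                    for j in range(num_topics)
                ]
                # Sample a new topic indicator and update the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                v_kw[k][w] += 1
                v_k[k] += 1
    return z, n_dk, v_kw

# Hypothetical corpus over a 4-word vocabulary.
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3], [0, 0, 1, 1], [3, 2, 2, 3, 3]]
z, n_dk, v_kw = collapsed_gibbs(docs, num_topics=2, vocab_size=4,
                                alpha=0.5, lam=0.1, sweeps=50)
print(n_dk)
```

Every sweep touches every word in every document, which is the cost that the scalability concerns in the lecture refer to.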
11 Geometrically. [Figure: geometric view of the topic assignment for the example document.]
Issues with Generic LDA Sampling. Slow mixing rates, so many iterations are needed. Each iteration cycles through sampling topic assignments for all words in all documents.
Modern approaches:
Large-scale LDA. For example: Mimno, David, Matthew D. Hoffman, and David M. Blei. "Sparse stochastic inference for latent Dirichlet allocation." International Conference on Machine Learning, 2012.
Distributed LDA. For example: Ahmed, Amr, et al. "Scalable inference in latent variable models." Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (2012).
Alternative: variational methods instead of sampling. Approximate the posterior with an optimized variational distribution.
12 Variational Methods. Recall the task: characterize the posterior. Turn posterior inference into an optimization task. Introduce a "tractable" family of distributions over parameters and latent variables; the family is indexed by a set of "free parameters". Find the member of the family closest to the true posterior.
Questions: How do we measure "closeness"? If the posterior is intractable, how can we approximate something we do not have to begin with?
A Measure of Closeness. Kullback-Leibler (KL) divergence: $D(p \| q) = E_p[\log (p(x) / q(x))]$. Measures the "distance" between two distributions $p$ and $q$. Not symmetric: $p$ determines where the difference is important. Where $p(x) = 0$ and $q(x) > 0$, the term contributes nothing; where $p(x) > 0$ and $q(x) = 0$, the divergence is infinite. Want the member $q$ of the family that minimizes $D(p \| q)$, with $p$ the posterior. Just as hard as the original problem!
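The asymmetry is easy to see on a small discrete example. In the sketch below the two distributions are invented for illustration and `kl` is a hypothetical helper; it uses the usual conventions that a term with $p(x) = 0$ contributes zero and that any point with $p(x) > 0$ and $q(x) = 0$ makes the divergence infinite.

```python
import math

def kl(p, q):
    """D(p || q) for two discrete distributions given as aligned lists."""
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0.0:
            continue            # p(x) = 0: the term contributes nothing
        if q_x == 0.0:
            return math.inf     # p(x) > 0 but q(x) = 0: infinite divergence
        total += p_x * math.log(p_x / q_x)
    return total

# Hypothetical distributions over three outcomes.
p = [0.5, 0.5, 0.0]
q = [0.9, 0.05, 0.05]
forward = kl(p, q)   # finite: q covers the support of p
reverse = kl(q, p)   # infinite: q puts mass where p has none
print(forward, reverse)
```

Which argument comes first therefore decides which distribution's support governs the penalty.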
13 Reverse Divergence.
Divergence $D(p \| q)$: the true distribution $p$ defines the support of the difference; the "correct" direction; will be intractable to compute.
Reverse divergence $D(q \| p)$: the approximate distribution $q$ defines the support; tends to give overconfident results; will be tractable.
Interpretations of Minimizing Reverse KL. Similarity measure: $D(q \| p)$ between the variational distribution $q$ and the posterior $p$. Evidence lower bound (ELBO): a quantity $L(q)$ satisfying $\log p(x) \geq L(q)$. Therefore, minimizing the KL divergence is equivalent to maximizing a lower bound on the marginal likelihood: maximizing $L(q)$ is the same as minimizing $D(q \| p)$, which is the same as maximizing a lower bound on $\log p(x)$.
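The equivalence can be written out in two lines. The derivation below is the standard one and is offered as a reconstruction rather than a copy of the slide; it assumes parameters $\theta$, latent variables $z$, and data $x$, with $q(\theta, z)$ the variational distribution.

```latex
\begin{aligned}
D\big(q \,\|\, p(\cdot \mid x)\big)
  &= E_q[\log q(\theta, z)] - E_q[\log p(\theta, z \mid x)] \\
  &= E_q[\log q(\theta, z)] - E_q[\log p(\theta, z, x)] + \log p(x), \\[4pt]
\log p(x)
  &= \underbrace{E_q[\log p(\theta, z, x)] - E_q[\log q(\theta, z)]}_{L(q)}
     + D\big(q \,\|\, p(\cdot \mid x)\big)
  \;\geq\; L(q).
\end{aligned}
```

Since $\log p(x)$ does not depend on $q$, raising $L(q)$ lowers the reverse KL divergence by the same amount, and $L(q)$ only requires the joint $p(\theta, z, x)$, never the intractable posterior.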
14 Mean Field. How do we choose a family $Q$ such that the ELBO is tractable? Simplest case: the mean field approximation. Assume each parameter and latent variable is conditionally independent given the set of free parameters. Then the entropy term decomposes into a sum of entropies, one per factor of $q$.
Mean Field (continued). Examine one free parameter at a time. The joint can be rewritten as $E_q[\log p(\theta, z, x)] = E_q[\log p(\theta \mid z, x)] + E_q[\log p(z, x)]$. Look at the terms of the ELBO that depend only on the free parameter of $q(\theta)$, and likewise at the terms that depend only on the free parameter of the $n$-th latent variable. This motivates using a coordinate ascent algorithm for optimization: iteratively optimize each free parameter, holding all others fixed.
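Coordinate ascent is easiest to see on a toy target that is not part of the lecture: a correlated bivariate Gaussian approximated by a product of two univariate Gaussians. For that target the optimal factor for each coordinate is Gaussian with variance $1/\Lambda_{ii}$ and a mean that depends on the other factor's current mean, so the updates below are exact. The target's mean and precision matrix are made-up values.

```python
def mean_field_gaussian(mu, prec, iters=50):
    """Coordinate ascent for q(z1) q(z2) approximating N(mu, prec^{-1}).

    mu   : (mu1, mu2), mean of the target
    prec : 2x2 precision matrix of the target
    Returns the factor means and the factor variances.
    """
    m1, m2 = 0.0, 0.0
    for _ in range(iters):
        # Update q(z1) holding q(z2) fixed, then q(z2) holding q(z1) fixed.
        m1 = mu[0] - prec[0][1] * (m2 - mu[1]) / prec[0][0]
        m2 = mu[1] - prec[1][0] * (m1 - mu[0]) / prec[1][1]
    variances = (1.0 / prec[0][0], 1.0 / prec[1][1])
    return (m1, m2), variances

# Hypothetical target with strongly coupled coordinates.
mu = (1.0, -2.0)
prec = [[2.0, 1.2], [1.2, 2.0]]
means, variances = mean_field_gaussian(mu, prec)

# True marginal variance of z1, from inverting the 2x2 precision matrix.
det = prec[0][0] * prec[1][1] - prec[0][1] * prec[1][0]
true_var_z1 = prec[1][1] / det
print(means, variances, true_var_z1)
```

The factor means converge to the true mean, while the factor variance (0.5) is smaller than the true marginal variance (0.78125), a small instance of the overconfidence associated with the reverse divergence.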
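15 Mean Field for LDA. In LDA, our parameters are $\theta = \{\{\pi^d\}, \{\beta_k\}\}$ and the latent variables are $z = \{z^d_i\}$.
The variational distribution factorizes as (free parameters written here as $\eta_k$, $\gamma^d$, and $\phi^d_i$)
$$q(\pi, \beta, z) = \prod_{k=1}^{K} q(\beta_k \mid \eta_k) \prod_{d=1}^{D} \left( q(\pi^d \mid \gamma^d) \prod_{i=1}^{N_d} q(z^d_i \mid \phi^d_i) \right)$$
The joint distribution factorizes as
$$p(\pi, \beta, z, w) = \prod_{k=1}^{K} p(\beta_k \mid \lambda) \prod_{d=1}^{D} \left( p(\pi^d \mid \alpha) \prod_{i=1}^{N_d} p(z^d_i \mid \pi^d)\, p(w^d_i \mid z^d_i, \beta) \right)$$
Examine the ELBO:
$$\begin{aligned} L(q) ={} & \sum_{k=1}^{K} E_q[\log p(\beta_k \mid \lambda)] + \sum_{d=1}^{D} E_q[\log p(\pi^d \mid \alpha)] \\ & + \sum_{d=1}^{D} \sum_{i=1}^{N_d} \left( E_q[\log p(z^d_i \mid \pi^d)] + E_q[\log p(w^d_i \mid z^d_i, \beta)] \right) \\ & - \sum_{k=1}^{K} E_q[\log q(\beta_k \mid \eta_k)] - \sum_{d=1}^{D} E_q[\log q(\pi^d \mid \gamma^d)] - \sum_{d=1}^{D} \sum_{i=1}^{N_d} E_q[\log q(z^d_i \mid \phi^d_i)] \end{aligned}$$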
16 Mean Field for LDA. Let's look at some of these terms: $E_q[\log p(z^d_i \mid \pi^d)]$ and $E_q[\log q(z^d_i \mid \phi^d_i)]$. Other terms follow similarly.
Optimize via Coordinate Ascent. Algorithm: update each free parameter in turn, holding the others fixed.
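For these two terms the expectations have closed forms under the usual choice of variational families, which is assumed here rather than read off the slide: $q(\pi^d \mid \gamma^d)$ is Dirichlet and $q(z^d_i \mid \phi^d_i)$ is categorical with probabilities $\phi^d_{ik}$. With $\psi$ denoting the digamma function:

```latex
\begin{aligned}
E_q[\log p(z^d_i \mid \pi^d)]
  &= \sum_{k=1}^{K} \phi^d_{ik}\, E_q[\log \pi^d_k]
   = \sum_{k=1}^{K} \phi^d_{ik}
     \Big( \psi(\gamma^d_k) - \psi\Big(\sum_{j=1}^{K} \gamma^d_j\Big) \Big), \\
E_q[\log q(z^d_i \mid \phi^d_i)]
  &= \sum_{k=1}^{K} \phi^d_{ik} \log \phi^d_{ik}.
\end{aligned}
```

The first line uses the Dirichlet identity $E[\log \pi_k] = \psi(\gamma_k) - \psi(\sum_j \gamma_j)$; the second is the negative entropy of the categorical factor.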
17 Optimize via Coordinate Ascent (continued).
Alternative Optimization Schemes. Coordinate ascent is inefficient: we start from randomly initialized topic parameters and must analyze the whole corpus before updating them again. In a streaming data scenario, we can't compute even one iteration! We didn't have to do coordinate ascent; we could have used gradient ascent.
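18 Alternative Optimization Schemes. Recall stochastic gradient ascent: replace the full gradient with an estimate computed from a minibatch of size $M$. Assume $M = 1$. The estimate is unbiased, but noisy. Here, with $x^d$ denoting the words of document $d$,
$$L = E_q[\log p(\beta)] - E_q[\log q(\beta)] + \sum_{d=1}^{D} \left( E_q[\log p(\pi^d)] - E_q[\log q(\pi^d)] \right) + \sum_{d=1}^{D} \left( E_q[\log p(z^d, x^d \mid \pi^d, \beta)] - E_q[\log q(z^d)] \right)$$
$$L_t = E_q[\log p(\beta)] - E_q[\log q(\beta)] + D \left( E_q[\log p(\pi^{d_t})] - E_q[\log q(\pi^{d_t})] \right) + D \left( E_q[\log p(z^{d_t}, x^{d_t} \mid \pi^{d_t}, \beta)] - E_q[\log q(z^{d_t})] \right)$$
Stochastic Variational Inference for LDA.
Initialize $\eta^{(0)}$ randomly.
Repeat (indefinitely):
  Sample a document $d$ uniformly from the data set.
  For all $k$, initialize $\gamma^d_k = 1$.
  Repeat until converged:
    For $i = 1, \ldots, N_d$: set $\phi^d_{ik} \propto \exp\{E[\log \pi^d_k] + E[\log \beta_{k, w^d_i}]\}$.
    Set $\gamma^d = \alpha + \sum_{i=1}^{N_d} \phi^d_i$.
  Take a stochastic gradient step: $\eta^{(t)} = \eta^{(t-1)} + \rho_t \nabla_\eta L_d$.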
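The loop above can be sketched in plain Python. Several choices here are assumptions, not taken from the slides: the gradient step is written in the natural-gradient form used by Hoffman, Blei, and Bach's online LDA, where the step moves $\eta$ toward $\hat\eta_{k,v} = \lambda + D \sum_i \phi_{ik} 1[w_i = v]$; the step size is $\rho_t = (t + \tau)^{-\kappa}$; hyperparameters are symmetric scalars; and the digamma function is a small hand-written approximation so that the block needs only the standard library.

```python
import math
import random

def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def svi_lda(docs, num_topics, vocab_size, alpha, lam, steps,
            seed=0, tau=1.0, kappa=0.7):
    rng = random.Random(seed)
    num_docs = len(docs)
    # Initialize eta randomly (one row of topic parameters per topic).
    eta = [[rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
           for _ in range(num_topics)]
    for t in range(1, steps + 1):
        doc = docs[rng.randrange(num_docs)]       # sample a document uniformly
        e_log_beta = []
        for k in range(num_topics):
            row_total = digamma(sum(eta[k]))
            e_log_beta.append([digamma(e) - row_total for e in eta[k]])
        gamma = [1.0] * num_topics                # initialize gamma_k = 1
        for _ in range(100):                      # local coordinate ascent
            gamma_total = digamma(sum(gamma))
            e_log_pi = [digamma(g) - gamma_total for g in gamma]
            phi = []
            for w in doc:
                logits = [e_log_pi[k] + e_log_beta[k][w] for k in range(num_topics)]
                top = max(logits)                 # subtract max for stability
                unnorm = [math.exp(v - top) for v in logits]
                norm = sum(unnorm)
                phi.append([u / norm for u in unnorm])
            new_gamma = [alpha + sum(p[k] for p in phi) for k in range(num_topics)]
            change = max(abs(a - b) for a, b in zip(new_gamma, gamma))
            gamma = new_gamma
            if change < 1e-4:
                break
        rho = (t + tau) ** (-kappa)               # decreasing step size
        for k in range(num_topics):
            # Target as if the whole corpus were D copies of this document.
            target = [lam] * vocab_size
            for p, w in zip(phi, doc):
                target[w] += num_docs * p[k]
            eta[k] = [(1.0 - rho) * e + rho * h for e, h in zip(eta[k], target)]
    return eta

# Hypothetical corpus over a 4-word vocabulary.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 0, 1], [3, 2, 2]]
eta = svi_lda(docs, num_topics=2, vocab_size=4, alpha=0.5, lam=0.1, steps=200)
print(eta)
```

Only one document is touched per step, so the topic parameters are updated long before a full pass over the corpus would finish.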
19 Acknowledgements. Thanks to Dave Blei, David Mimno, and Jordan Boyd-Graber for some material in this lecture relating to LDA.