Lecture 2: Correlated Topic Model


Probabilistic Models for Unsupervised Learning, Spring 2013
Lecture 2: Correlated Topic Model
Inference for the Correlated Topic Model
Yuan Yuan

First of all, let us fix notation for the parameters and variables of the model. Let K be the number of topics, D the number of documents, and V the number of terms in the vocabulary. We use i to index a topic [1], d to index a document [2], n to index a word [3], and w (or v) to denote a word. In the correlated topic model, µ (K-dimensional), Σ (K × K) and β (K × V) are model parameters, while η (D × K) and z [4] are hidden variables. As a variational distribution q(·), we use a fully factorized model, where all the variables are independently governed by a different distribution,

q(η, z | λ, ν, φ) = q(η | λ, ν) q(z | φ). (1.1)

Here λ (D × K), ν (D × K) and φ [5] are variational parameters. Note that the only assumption we have made in the variational inference is that η and z are independent; we do not specify any probability functions for these two hidden variables. The topic assignments of words and the documents are exchangeable, i.e., independent conditioned on the parameters (either model parameters or variational parameters). Note also that the variational distribution q(·) is a conditional distribution and should be written as q(· | w); for simplicity we write it as q(·).

The main idea is to use variational expectation-maximization (EM). In the E-step of variational EM, we use the variational approximation to the posterior described above and find the optimal values of the variational parameters. In the M-step, we maximize the bound with respect to the model parameters. More concisely, we perform variational inference to learn the variational parameters in the E-step, and parameter estimation in the M-step. The two steps alternate in each iteration. We optimize the lower bound with respect to the variational parameters and the model parameters one by one, that is, we perform the optimization with a coordinate ascent algorithm.

1.1 Variational objective function

1.1.1 Finding a lower bound for log p(w | µ, Σ, β)

Jensen's inequality. Let X be a random variable and f a convex function. Then f(E(X)) ≤ E(f(X)). If f is a concave function, then f(E(X)) ≥ E(f(X)).

[1] ∑_i means ∑_{i=1}^{K}.
[2] ∑_d means ∑_{d=1}^{D}.
[3] ∑_n means ∑_{n=1}^{N}, where N is the length of the current document.
[4] z is represented as a three-dimensional array; each entry, indexed by a triplet <d, n, i>, indicates whether the topic assignment of the nth word in the dth document is the ith topic.
[5] Corresponding to z, φ is represented as a three-dimensional array; each entry, indexed by a triplet <d, n, i>, gives the probability that the nth word in the dth document is assigned to the ith topic.
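Since everything below leans on the concave case f = log, here is a tiny NumPy check of the inequality on simulated data (a sketch; the gamma distribution is just an arbitrary positive example, not part of the model):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.gamma(shape=2.0, scale=1.0, size=100_000)  # arbitrary positive samples
    print(np.log(x.mean()))   # log E(X)
    print(np.log(x).mean())   # E(log X): smaller, since log is concave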

We use Jensen's inequality to bound the log probability of a document [6]:

log p(w | µ, Σ, β) = log ∫ ∑_z p(η, z, w | µ, Σ, β) dη
= log ∫ ∑_z p(η, z, w | µ, Σ, β) (q(η, z) / q(η, z)) dη
≥ ∫ ∑_z q(η, z) log p(η, z, w | µ, Σ, β) dη − ∫ ∑_z q(η, z) log q(η, z) dη
= E_q[log p(η | µ, Σ)] + E_q[log p(z | η)] + E_q[log p(w | z, β)] + H(q)
≜ L(λ, ν, φ | µ, Σ, β). (1.2)

We can easily verify that

log p(w | µ, Σ, β) = L(λ, ν, φ | µ, Σ, β) + D(q(η, z | λ, ν, φ) ‖ p(η, z | µ, Σ, β, w)). (1.3)

We have indeed found a lower bound for log p(w | µ, Σ, β), namely L(λ, ν, φ | µ, Σ, β). Moreover, Eq. (1.3) shows that maximizing the lower bound L(λ, ν, φ | µ, Σ, β) with respect to λ, ν and φ is equivalent to minimizing the KL divergence between the variational posterior and the true posterior, i.e., the optimization problem presented in Eq. (1.24) below [7].

1.1.2 Expanding the lower bound

E_q[log p(η | µ, Σ)] (1.4)
= (1/2) log |Σ⁻¹| − (K/2) log 2π − (1/2) E_q[(η − µ)ᵀ Σ⁻¹ (η − µ)] (1.5)
= (1/2) log |Σ⁻¹| − (K/2) log 2π − (1/2) ( Trace(diag(ν²) Σ⁻¹) + (λ − µ)ᵀ Σ⁻¹ (λ − µ) ). (1.6)

Let z_n denote the topic assignment of the nth word in the current document; it is an indicator vector: z_{n,i} = 1 when the topic assignment is the ith topic, and z_{n,i} = 0 otherwise.

E_q[log p(z | η)] = ∑_n E_q[log p(z_n | η)] (1.7)
= ∑_{n,i} E_q[ z_{n,i} log ( exp(η_i) / ∑_j exp(η_j) ) ] (1.8)
= ∑_{n,i} E_q[z_{n,i} η_i] − ∑_n E_q[log ∑_i exp(η_i)]. (1.9)

[6] For now we suppress the document index d, since all the hidden variables and variational parameters are document-specific. We will use the document index explicitly in the parameter estimation part, since the model parameters are shared by all the documents.
[7] When we learn the variational parameters, we fix all the model parameters, so log p(w | µ, Σ, β) can be considered a fixed value: it is the sum of the lower bound and the KL divergence.
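As a sanity check on Eq. (1.6), here is a small NumPy sketch that evaluates E_q[log p(η | µ, Σ)] from the variational parameters; the function and variable names (e_log_p_eta, lam, nu2) are illustrative, not from any particular library:

    import numpy as np

    def e_log_p_eta(lam, nu2, mu, Sigma):
        # E_q[log p(eta | mu, Sigma)] for q(eta) = N(lambda, diag(nu^2)),
        # following Eq. (1.6)
        K = len(lam)
        Sigma_inv = np.linalg.inv(Sigma)
        logdet = np.linalg.slogdet(Sigma_inv)[1]   # log |Sigma^{-1}|
        diff = lam - mu
        return (0.5 * logdet
                - 0.5 * K * np.log(2 * np.pi)
                - 0.5 * (np.trace(np.diag(nu2) @ Sigma_inv)
                         + diff @ Sigma_inv @ diff))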

We can verify that E_q[z_{n,i} η_i] = λ_i φ_{n,i}. It is more difficult to derive ∑_n E_q[log ∑_i exp(η_i)]. To preserve the lower bound on the log probability, we upper bound the log normalizer (which enters Eq. (1.9) with a negative sign) by a first-order Taylor expansion:

E_q[log ∑_i exp(η_i)] ≤ ζ⁻¹ ( ∑_i E_q[exp(η_i)] ) − 1 + log ζ, (1.10)

where we have introduced a new slack parameter ζ. The expectation E_q[exp(η_i)] is the mean of a log-normal distribution whose mean and variance are obtained from the variational parameters {λ_i, ν_i²}: E_q[exp(η_i)] = exp(λ_i + ν_i²/2). Using this additional bound, the right side of Eq. (1.9) is bounded below by

E_q[log p(z | η)] ≥ ∑_{n,i} λ_i φ_{n,i} − N ( ζ⁻¹ ∑_i exp(λ_i + ν_i²/2) − 1 + log ζ ). (1.11)

E_q[log p(w | z, β)] = ∑_n E_q[log p(w_n | z_n, β)] (1.12)
= ∑_{n,i} E_q[z_{n,i} log β_{i,w_n}] (1.13)
= ∑_{n,i} φ_{n,i} log β_{i,w_n}. (1.14)

H(q) = −∫ ∑_z q(η, z) log q(η, z) dη (1.15)
= −∫ q(η) log q(η) dη − ∑_z q(z) log q(z) (1.16)
= ∑_i (1/2)(log ν_i² + log 2π + 1) − ∑_{n,i} φ_{n,i} log φ_{n,i}. (1.17)

We also present the detailed derivation of −∫ q(η) log q(η) dη:

−∫ q(η) log q(η) dη = −∑_i ∫ q(η_i | λ_i, ν_i²) log q(η_i | λ_i, ν_i²) dη_i (1.18, 1.19)
= −∑_i ∫ (2πν_i²)^{-1/2} exp( −(η_i − λ_i)²/(2ν_i²) ) ( −(η_i − λ_i)²/(2ν_i²) − (1/2) log(2πν_i²) ) dη_i (1.20, 1.21)
= ∑_i (1/2)(1 + log 2π + log ν_i²). (1.22)

Here we use the following property of a Gaussian distribution p(x) with mean µ and variance δ²:

∫ (x − µ)² p(x) dx = δ². (1.23)
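The remaining three terms, Eqs. (1.11), (1.14) and (1.17), can be evaluated the same way; together with e_log_p_eta from the previous sketch they make up the full per-document bound L. Again a sketch with illustrative names (phi is the N × K matrix of word-topic probabilities, words the length-N array of word indices; all phi entries are assumed strictly positive):

    import numpy as np

    def remaining_bound_terms(lam, nu2, phi, zeta, words, beta):
        N = len(words)
        # lower bound on E_q[log p(z | eta)], Eq. (1.11)
        e_z = (phi @ lam).sum() - N * (
            np.exp(lam + nu2 / 2).sum() / zeta - 1 + np.log(zeta))
        # E_q[log p(w | z, beta)], Eq. (1.14)
        e_w = (phi * np.log(beta[:, words]).T).sum()
        # entropy H(q), Eq. (1.17)
        h = 0.5 * (np.log(nu2) + np.log(2 * np.pi) + 1).sum() \
            - (phi * np.log(phi)).sum()
        return e_z + e_w + h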

1.2 Variational inference

The aim of variational inference is to learn the values of the variational parameters λ, ν, φ. With the learned variational parameters, we can evaluate the posterior probabilities of the hidden variables. Having specified a simplified family of probability distributions, the next step is to set up an optimization problem that determines their values. We obtain a solution for the variational parameters by solving

(λ*, ν*, φ*) = arg min_{λ,ν,φ} D(q(η, z | λ, ν, φ) ‖ p(η, z | µ, Σ, β, w)). (1.24)

By Eq. (1.3), we can minimize D(q(η, z | λ, ν, φ) ‖ p(η, z | µ, Σ, β, w)) by maximizing the lower bound L(λ, ν, φ | µ, Σ, β).

1.2.1 Learning the variational parameters

We have expanded each term of the lower bound L(λ, ν, φ | µ, Σ, β) of Eq. (1.2). We now maximize the bound with respect to the variational parameters λ, ν, φ and the slack variable ζ we have introduced.

First, we maximize Eq. (1.10) with respect to ζ. The derivative is

∂L/∂ζ = N ( ζ⁻² ∑_i exp(λ_i + ν_i²/2) − ζ⁻¹ ), (1.25)

which has its maximum at

ζ̂ = ∑_i exp(λ_i + ν_i²/2). (1.26)

Second, we maximize with respect to φ_{n,i}. We have

∂L/∂φ_{n,i} = log β_{i,w_n} − log φ_{n,i} − 1 + λ_i + τ_n, (1.27)

where τ_n is a Lagrange multiplier enforcing ∑_i φ_{n,i} = 1. This has its maximum at

φ̂_{n,i} ∝ exp(λ_i) β_{i,w_n}. (1.28)

Then we optimize the Gaussian variational parameters λ and ν. For λ, we have the derivative

∂L/∂λ = −Σ⁻¹(λ − µ) + ∑_n φ_{n,1:K} − (N/ζ) exp(λ + ν²/2), (1.29)

where φ_{n,1:K} is a column vector. Here we use a property of matrix gradients: ∂(xᵀAx)/∂x = 2Ax if A is a symmetric matrix and x is a vector. We cannot obtain a closed-form solution for λ, so we feed the above derivative to a gradient-based optimization algorithm, e.g., the conjugate gradient algorithm.
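Before turning to ν, here is a sketch of the updates derived so far, under the same naming assumptions as the earlier sketches. The ζ and φ updates are closed form; for λ the sketch only supplies the gradient of Eq. (1.29), which a minimizer such as scipy.optimize.minimize with method="CG" could consume (applied to the negated bound and gradient):

    import numpy as np

    def update_zeta(lam, nu2):
        # Eq. (1.26): zeta = sum_i exp(lambda_i + nu_i^2 / 2)
        return np.exp(lam + nu2 / 2).sum()

    def update_phi(lam, beta, words):
        # Eq. (1.28): phi_{n,i} proportional to exp(lambda_i) * beta_{i,w_n},
        # normalized over topics for each word
        phi = np.exp(lam)[None, :] * beta[:, words].T   # N x K
        return phi / phi.sum(axis=1, keepdims=True)

    def grad_lambda(lam, nu2, phi, zeta, mu, Sigma_inv, N):
        # Eq. (1.29); negate when handing to a minimizer
        return (-Sigma_inv @ (lam - mu)
                + phi.sum(axis=0)
                - (N / zeta) * np.exp(lam + nu2 / 2))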

Finally, we have the derivative with respect to ν_{d,i}² [8]:

∂L/∂ν_{d,i}² = −Σ⁻¹_{ii}/2 − (N/(2ζ)) exp(λ_{d,i} + ν_{d,i}²/2) + 1/(2ν_{d,i}²). (1.30)

Again there is no analytic solution, and we can use Newton's method with the constraint ν_{d,i}² > 0. We do not present the details of these optimization methods (e.g., Newton's method); generally speaking, they are easy to apply once the derivatives are available.

1.3 Parameter estimation

In this section, we estimate the model parameters β, µ and Σ. We do so by using the variational lower bound, with the variational parameters fixed, as a surrogate for the (intractable) marginal log likelihood. Note that we first aggregate the document-specific lower bounds defined in Eq. (1.2), and from here on we use the document index d explicitly.

We first rewrite the lower bound, keeping only the terms that contain β, with Lagrange multipliers ρ_i:

L_[β] = ∑_{d,n,i} φ_{d,n,i} log β_{i,w_{dn}} + ∑_i ρ_i ( ∑_{v=1}^{V} β_{i,v} − 1 ). (1.31)

Taking the derivative of L_[β], we have

∂L_[β]/∂β_{i,v} = ∑_{d,n} φ_{d,n,i} 1(v = w_{dn}) / β_{i,v} + ρ_i, (1.32)

where 1(v = w_{dn}) is an indicator function that returns 1 when the condition is true and 0 otherwise. Setting ∑_{d,n} φ_{d,n,i} 1(v = w_{dn}) / β_{i,v} + ρ_i to zero and solving for ρ_i gives ρ_i = −∑_{d,n,v} φ_{d,n,i} 1(v = w_{dn}). Since ∑_v β_{i,v} = 1, we can ignore ρ_i and estimate an unnormalized value of β_{i,v}:

β̂_{i,v} ∝ ∑_{d,n} φ_{d,n,i} 1(v = w_{dn}). (1.33)

Similarly, we can rewrite the lower bound keeping only the terms that contain µ:

L_[µ] = −∑_d (1/2)(λ_d − µ)ᵀ Σ⁻¹ (λ_d − µ). (1.34)

Taking the derivative of L_[µ], we have

∂L_[µ]/∂µ = ∑_d Σ⁻¹ (λ_d − µ). (1.35)

Setting ∂L_[µ]/∂µ to zero, we have

µ̂ = (1/D) ∑_d λ_d. (1.36)

[8] With respect to ν_{d,i}², NOT ν_{d,i}.
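A sketch of these two M-step updates, Eqs. (1.33) and (1.36), given the fitted per-document variational parameters. Here phis is a list of N_d × K responsibility matrices, docs the matching lists of word indices, and lambdas a D × K matrix; all names are illustrative:

    import numpy as np

    def m_step_beta(phis, docs, K, V):
        # Eq. (1.33): accumulate phi_{d,n,i} into cell (i, w_dn) of beta,
        # then normalize each topic's row to sum to one
        beta = np.zeros((K, V))
        for phi, words in zip(phis, docs):      # one document at a time
            for n, w in enumerate(words):
                beta[:, w] += phi[n]
        return beta / beta.sum(axis=1, keepdims=True)

    def m_step_mu(lambdas):
        # Eq. (1.36): mu = (1/D) sum_d lambda_d
        return lambdas.mean(axis=0)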

Then we collect all the terms of the lower bound containing Σ:

L_[Σ] = (D/2) log |Σ⁻¹| − ∑_d ( (1/2) Trace(diag(ν_d²) Σ⁻¹) + (1/2)(λ_d − µ)ᵀ Σ⁻¹ (λ_d − µ) ) (1.37)
= (D/2) log |Σ⁻¹| − ∑_d ( (1/2) Trace(diag(ν_d²) Σ⁻¹) + (1/2) Trace( (λ_d − µ)ᵀ Σ⁻¹ (λ_d − µ) ) ) (1.38)
= (D/2) log |Σ⁻¹| − ∑_d ( (1/2) Trace(diag(ν_d²) Σ⁻¹) + (1/2) Trace( Σ⁻¹ (λ_d − µ)(λ_d − µ)ᵀ ) ). (1.39)

In the above, we use the trace trick: for square matrices A and B, Trace(AB) = Trace(BA). Next we take the derivative of L_[Σ] with respect to Σ⁻¹, using the following properties: (1) ∂ log |A| / ∂A = (A⁻¹)ᵀ; (2) ∂ Trace(AB)/∂A = ∂ Trace(BA)/∂A = Bᵀ. We obtain

∂L_[Σ]/∂Σ⁻¹ = (D/2) Σᵀ − (1/2) ∑_d ( diag(ν_d²) + (λ_d − µ)(λ_d − µ)ᵀ )ᵀ, (1.40)

so that, setting the derivative to zero,

Σ̂ = (1/D) ∑_d ( diag(ν_d²) + (λ_d − µ)(λ_d − µ)ᵀ ). (1.41)

1.4 Discussion of the convergence

We can discuss the convergence of variational EM for CTM informally, in terms of either the change in the conditional likelihood log p(w | µ, Σ, β) or the change in the lower bound L(λ, ν, φ | µ, Σ, β). Since both the E-step and the M-step perform coordinate ascent on the lower bound L(λ, ν, φ | µ, Σ, β), the bound increases monotonically and converges (to a local optimum). As for log p(w | µ, Σ, β): in the E-step we increase its lower bound with respect to the variational parameters, and in the M-step we increase the lower bound further, so the likelihood probably increases as well. However, the gap D(q(η, z | λ, ν, φ) ‖ p(η, z | µ, Σ, β, w)) is usually nonzero and might decrease after optimizing over the model parameters, so although the lower bound never decreases, it is not obvious that the likelihood itself converges monotonically.
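Finally, a sketch of the Σ update, Eq. (1.41), together with the bound-based convergence test just described. Here lambdas and nu2s are D × K arrays of per-document variational parameters, and the tolerance is an arbitrary choice; as above, all names are illustrative:

    import numpy as np

    def m_step_sigma(lambdas, nu2s, mu):
        # Eq. (1.41): Sigma = (1/D) sum_d ( diag(nu_d^2)
        #                                   + (lambda_d - mu)(lambda_d - mu)^T )
        diff = lambdas - mu                          # D x K
        return (np.diag(nu2s.sum(axis=0)) + diff.T @ diff) / len(lambdas)

    def converged(bound, bound_old, tol=1e-4):
        # monitor the relative change of the lower bound L, which
        # coordinate ascent never decreases
        return abs(bound - bound_old) <= tol * (abs(bound_old) + 1e-12)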
