Lecture 2: Correlated Topic Model
Probabilistic Models for Unsupervised Learning, Spring 2013
Inference for the Correlated Topic Model
Yuan Yuan

First of all, let us fix the notation for the parameters and variables in the model. Let K be the number of topics, D the number of documents, and V the number of terms in the vocabulary. We use i to index a topic (so Σ_i means Σ_{i=1}^K), d to index a document (Σ_d means Σ_{d=1}^D), n to index a word (Σ_n means Σ_{n=1}^N, where N is the length of the current document), and w (or v) to denote a word. In the correlated topic model, μ ∈ R^K, Σ ∈ R^{K×K} and β ∈ R^{K×V} are model parameters, while η ∈ R^{D×K} and z are hidden variables. Here z is represented as a three-dimensional array: the entry indexed by the triplet <d, n, i> indicates whether the topic assignment of the n-th word in the d-th document is the i-th topic.

As the variational distribution q we use a fully factorized model, in which all the variables are independently governed by different distributions:

q(η, z | λ, ν, φ) = q(η | λ, ν) q(z | φ),   (1.1)

where λ ∈ R^{D×K}, ν ∈ R^{D×K} and φ are variational parameters. Corresponding to z, φ is a three-dimensional array: the entry indexed by <d, n, i> gives the probability that the n-th word in the d-th document is assigned to the i-th topic. Note that the only assumption we have made in the variational inference is that η and z are independent; we do not yet specify any probability functions for these two hidden variables. The topic assignments of words, and the documents themselves, are exchangeable, i.e., independent conditioned on the parameters (either model parameters or variational parameters). Strictly speaking, the variational distribution q is a conditional distribution and should be written q(· | w); for simplicity we write it as q(·).

The main idea is to use variational expectation-maximization (EM). In the E-step, we use the variational approximation to the posterior described in the previous section and find the optimal values of the variational parameters. In the M-step, we maximize the bound with respect to the model parameters. Put more concisely, we perform variational inference to learn the variational parameters in the E-step, and parameter estimation in the M-step; the two steps alternate in each iteration. Optimizing the lower bound with respect to the variational parameters and the model parameters one group at a time amounts to a coordinate-ascent algorithm.

1.1 Variational objective function

1.1.1 Finding a lower bound for log p(w | μ, Σ, β)

Jensen's inequality. Let X be a random variable and f a convex function. Then f(E(X)) ≤ E(f(X)). If f is a concave function, then f(E(X)) ≥ E(f(X)).
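As a quick numerical illustration of the concave case (a sketch using NumPy; the distribution and variable names are our own choices, not from the notes), with f = log we expect f(E[X]) ≥ E[f(X)]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples of a positive random variable X (log-normal here, chosen arbitrarily).
x = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# f(x) = log x is concave, so Jensen's inequality gives f(E[X]) >= E[f(X)].
f_of_mean = np.log(x.mean())
mean_of_f = np.log(x).mean()
print(f_of_mean >= mean_of_f)   # prints True
```

The gap between the two quantities is exactly the slack that the variational lower bound below trades away for tractability.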
We use Jensen's inequality to bound the log probability of a document. (For now we suppress the document index d, since all the hidden variables and variational parameters are document-specific; we will use the index explicitly in the parameter-estimation part, where the model parameters are shared across all documents.)

log p(w | μ, Σ, β) = log ∫ Σ_z p(η, z, w | μ, Σ, β) dη
= log ∫ Σ_z p(η, z, w | μ, Σ, β) (q(η, z) / q(η, z)) dη
≥ ∫ Σ_z q(η, z) log p(η, z, w | μ, Σ, β) dη − ∫ Σ_z q(η, z) log q(η, z) dη
= E_q[log p(η | μ, Σ)] + E_q[log p(z | η)] + E_q[log p(w | z, β)] + H(q)
=: L(λ, ν, φ | μ, Σ, β).   (1.2)

We can easily verify that

log p(w | μ, Σ, β) = L(λ, ν, φ | μ, Σ, β) + D(q(η, z | λ, ν, φ) ‖ p(η, z | μ, Σ, β, w)).   (1.3)

We have indeed found a lower bound L(λ, ν, φ | μ, Σ, β) for log p(w | μ, Σ, β). When we learn the variational parameters, all the model parameters are held fixed, so log p(w | μ, Σ, β) can be treated as a fixed value, equal to the sum of the lower bound and the KL divergence. Equation (1.3) therefore shows that maximizing the lower bound with respect to λ, ν and φ is equivalent to minimizing the KL divergence between the variational posterior and the true posterior, which is the optimization problem presented earlier.

1.1.2 Expanding the lower bound

E_q[log p(η | μ, Σ)]   (1.4)
= −(1/2) log |Σ| − (K/2) log 2π − (1/2) E_q[(η − μ)^T Σ^{−1} (η − μ)]   (1.5)
= −(1/2) log |Σ| − (K/2) log 2π − (1/2) (Trace(diag(ν²) Σ^{−1}) + (λ − μ)^T Σ^{−1} (λ − μ)).   (1.6)

Let z_n denote the topic assignment of the n-th word in the current document; it is an indicator vector, with z_{n,i} = 1 when the assignment is the i-th topic and z_{n,i} = 0 otherwise.
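The closed form (1.6) can be sanity-checked against a Monte Carlo estimate under q. A sketch with NumPy (function and variable names are ours; the Monte Carlo expression below is specialized to μ = 0, Σ = I for brevity):

```python
import numpy as np

def eq_log_p_eta(lam, nu2, mu, Sigma):
    # Closed form of E_q[log p(eta | mu, Sigma)], Eq. (1.6),
    # for the fully factorized q(eta_i) = N(lam_i, nu2_i).
    K = len(lam)
    Sinv = np.linalg.inv(Sigma)
    quad = np.trace(np.diag(nu2) @ Sinv) + (lam - mu) @ Sinv @ (lam - mu)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * logdet - 0.5 * K * np.log(2.0 * np.pi) - 0.5 * quad

rng = np.random.default_rng(1)
K = 3
lam = np.array([0.1, -0.2, 0.3])
nu2 = np.array([0.5, 0.4, 0.3])
mu, Sigma = np.zeros(K), np.eye(K)

# Monte Carlo estimate of the same expectation: sample eta ~ q, then
# average log N(eta; 0, I) = -(K/2) log 2pi - ||eta||^2 / 2.
eta = rng.normal(lam, np.sqrt(nu2), size=(200_000, K))
log_p = -0.5 * K * np.log(2.0 * np.pi) - 0.5 * np.sum(eta**2, axis=1)
```

With 200,000 samples the Monte Carlo mean agrees closely with `eq_log_p_eta`.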
E_q[log p(z | η)] = Σ_n E_q[log p(z_n | η)]   (1.7)
= Σ_n Σ_i E_q[ z_{n,i} log ( exp(η_i) / Σ_j exp(η_j) ) ]   (1.8)
= Σ_n Σ_i E_q[z_{n,i} η_i] − Σ_n E_q[log Σ_i exp(η_i)].   (1.9)

We have E_q[z_{n,i} η_i] = λ_i φ_{n,i}, since z and η are independent under q. It is a bit more difficult to deal with Σ_n E_q[log Σ_i exp(η_i)]. To preserve the lower bound on the log probability, we upper-bound this negative log normalizer with a first-order Taylor expansion of log(·):

E_q[log Σ_i exp(η_i)] ≤ ζ^{−1} Σ_i E_q[exp(η_i)] − 1 + log ζ,   (1.10)

where we have introduced a new slack parameter ζ > 0. The expectation E_q[exp(η_i)] is the mean of a log-normal distribution whose underlying Gaussian parameters are the variational parameters {λ_i, ν_i²}: E_q[exp(η_i)] = exp(λ_i + ν_i²/2). Using this additional bound, the right-hand side of (1.9) satisfies

E_q[log p(z | η)] ≥ Σ_n Σ_i λ_i φ_{n,i} − N (ζ^{−1} Σ_i exp(λ_i + ν_i²/2) − 1 + log ζ).   (1.11)

E_q[log p(w | z, β)] = Σ_n E_q[log p(w_n | z_n, β)]   (1.12)
= Σ_n Σ_i E_q[z_{n,i} log β_{i,w_n}]   (1.13)
= Σ_n Σ_i φ_{n,i} log β_{i,w_n}.   (1.14)

H(q) = −∫ Σ_z q(η, z) log q(η, z) dη   (1.15)
= −∫ q(η) log q(η) dη − Σ_z q(z) log q(z)   (1.16)
= Σ_i (1/2)(log ν_i² + log 2π + 1) − Σ_n Σ_i φ_{n,i} log φ_{n,i}.   (1.17)

We also present the detailed derivation of −∫ q(η) log q(η) dη:

−∫ q(η) log q(η) dη = −Σ_i ∫ q(η_i) log q(η_i) dη_i   (1.18)
= −Σ_i ∫ q(η_i | λ_i, ν_i²) log q(η_i | λ_i, ν_i²) dη_i   (1.19)
= −Σ_i ∫ (1/√(2πν_i²)) exp(−(η_i − λ_i)²/(2ν_i²)) ( −(η_i − λ_i)²/(2ν_i²) − (1/2) log(2πν_i²) ) dη_i   (1.20)
= Σ_i ( E_q[(η_i − λ_i)²]/(2ν_i²) + (1/2) log(2πν_i²) )   (1.21)
= Σ_i (1/2)(1 + log 2π + log ν_i²).   (1.22)
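The bound (1.10) together with the log-normal mean identity can be checked numerically. A sketch with NumPy (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4
lam = rng.normal(size=K)
nu2 = rng.uniform(0.1, 1.0, size=K)

# Monte Carlo estimate of E_q[log sum_i exp(eta_i)] with q(eta_i) = N(lam_i, nu2_i).
eta = rng.normal(lam, np.sqrt(nu2), size=(200_000, K))
lhs = np.log(np.exp(eta).sum(axis=1)).mean()

# Right-hand side of Eq. (1.10). By the log-normal identity,
# E_q[exp(eta_i)] = exp(lam_i + nu2_i / 2). Any zeta > 0 gives a valid bound;
# it is tightest at zeta = sum_i exp(lam_i + nu2_i / 2), used here.
zeta = np.sum(np.exp(lam + nu2 / 2))
rhs = np.sum(np.exp(lam + nu2 / 2)) / zeta - 1 + np.log(zeta)
```

At the optimal ζ the right-hand side reduces to log Σ_i exp(λ_i + ν_i²/2), which indeed upper-bounds the Monte Carlo estimate of the expectation.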
In (1.20) through (1.22) we used a property of a Gaussian density p(x) with mean μ and variance δ²:

∫ p(x) (x − μ)² dx = δ².   (1.23)

1.2 Variational inference

The aim of variational inference is to learn the values of the variational parameters λ, ν, φ. With the learnt variational parameters, we can evaluate the posterior probabilities of the hidden variables. Having specified a simplified family of probability distributions, the next step is to set up an optimization problem that determines the values of these variational parameters:

(λ, ν, φ) = argmin_{λ,ν,φ} D(q(η, z | λ, ν, φ) ‖ p(η, z | μ, Σ, β, w)).   (1.24)

By (1.3), we can minimize this KL divergence by maximizing the lower bound L(λ, ν, φ | μ, Σ, β).

1.2.1 Learning the variational parameters

We have expanded each term of the lower bound L(λ, ν, φ | μ, Σ, β) of (1.2). We now maximize the bound with respect to the variational parameters λ, ν, φ and the slack parameter ζ we have introduced.

First, we maximize with respect to ζ. The derivative is

∂L/∂ζ = N ( ζ^{−2} Σ_i exp(λ_i + ν_i²/2) − ζ^{−1} ),   (1.25)

which has its maximum at

ζ̂ = Σ_i exp(λ_i + ν_i²/2).   (1.26)

Second, we maximize with respect to φ_{n,i}, subject to Σ_i φ_{n,i} = 1. We have

∂L/∂φ_{n,i} = λ_i + log β_{i,w_n} − log φ_{n,i} − 1 + τ_n,   (1.27)

where τ_n is a Lagrange multiplier enforcing the constraint; the maximum is at

φ̂_{n,i} ∝ exp(λ_i) β_{i,w_n}.   (1.28)

Then we optimize the Gaussian variational parameters λ and ν. For λ, the derivative is

∂L/∂λ = −Σ^{−1}(λ − μ) + Σ_n φ_{n,1:K} − (N/ζ) exp(λ + ν²/2),   (1.29)

where φ_{n,1:K} is the column vector (φ_{n,1}, …, φ_{n,K})^T and exp(·) acts elementwise. Here we used a property of matrix gradients: ∂(x^T A x)/∂x = 2Ax when A is a symmetric matrix and x a vector. We cannot obtain a closed-form solution for λ, so we feed this derivative to an iterative optimization algorithm, e.g., the conjugate-gradient algorithm.
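The closed-form updates (1.26) and (1.28) are straightforward to implement. A minimal sketch with NumPy (function names and array layout are our own choices):

```python
import numpy as np

def update_zeta(lam, nu2):
    # Eq. (1.26): closed-form maximizer of the slack parameter zeta.
    return np.sum(np.exp(lam + nu2 / 2))

def update_phi(lam, beta, words):
    # Eq. (1.28): phi_{n,i} proportional to exp(lam_i) * beta_{i, w_n},
    # normalized over topics i separately for each word position n.
    # beta has shape (K, V); words holds the vocabulary index of each word.
    phi = np.exp(lam)[None, :] * beta[:, words].T   # shape (N, K)
    return phi / phi.sum(axis=1, keepdims=True)
```

For instance, with λ = 0 the update φ̂ simply renormalizes the column of β picked out by each word.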
Finally, we take the derivative with respect to ν_i² (note: with respect to ν_i², not ν_i):

∂L/∂ν_i² = −(1/2) Σ^{−1}_{ii} − (N/2ζ) exp(λ_i + ν_i²/2) + 1/(2ν_i²).   (1.30)

Again there is no analytic solution, and we can use Newton's method with the constraint ν_i² > 0. We do not present the details of these optimization methods (e.g., Newton's method); generally speaking, they are easy to apply once the derivatives are available.

1.3 Parameter estimation

In this section, we continue with the estimation of the model parameters, i.e., β, μ and Σ. We solve this problem by using the variational lower bound as a surrogate for the (intractable) marginal log likelihood, with the variational parameters held fixed. Note that we should first aggregate the document-specific lower bounds defined in (1.2); in this part we use the document index d explicitly.

We first rewrite the lower bound keeping only the terms that contain β, with Lagrange multipliers ρ_i for the constraints Σ_v β_{i,v} = 1:

L_[β] = Σ_{d,n,i} φ_{d,n,i} log β_{i,w_{d,n}} + Σ_i ρ_i ( Σ_{v=1}^V β_{i,v} − 1 ).   (1.31)

Taking the derivative,

∂L_[β]/∂β_{i,v} = Σ_{d,n} φ_{d,n,i} 1(v = w_{d,n}) / β_{i,v} + ρ_i,   (1.32)

where 1(v = w_{d,n}) is an indicator function, equal to 1 when the condition is true and 0 otherwise. Setting the derivative to zero and solving gives ρ_i = −Σ_{d,n,v} φ_{d,n,i} 1(v = w_{d,n}). Since Σ_v β_{i,v} = 1, we can ignore ρ_i, compute the unnormalized values

β̂_{i,v} ∝ Σ_{d,n} φ_{d,n,i} 1(v = w_{d,n}),   (1.33)

and normalize each row.
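A sketch of the β update (1.33) in NumPy (names are ours; we assume `phi[d]` holds the N_d × K variational multinomials of document d and `docs[d]` its word indices):

```python
import numpy as np

def estimate_beta(phi, docs, K, V):
    # Eq. (1.33): beta_hat_{i,v} proportional to the sum over documents d and
    # positions n of phi_{d,n,i} * 1(w_{d,n} = v), then normalized per topic row.
    beta = np.zeros((K, V))
    for d, words in enumerate(docs):
        for n, v in enumerate(words):
            beta[:, v] += phi[d][n]          # phi[d] has shape (N_d, K)
    return beta / beta.sum(axis=1, keepdims=True)
```

With uniform φ the estimate reduces to the empirical word frequencies, a useful quick check.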
Similarly, we rewrite the lower bound keeping only the terms that contain μ:

L_[μ] = −(1/2) Σ_d (λ_d − μ)^T Σ^{−1} (λ_d − μ).   (1.34)

Taking the derivative,

∂L_[μ]/∂μ = Σ_d Σ^{−1} (λ_d − μ).   (1.35)

Setting ∂L_[μ]/∂μ to zero, we have

μ̂ = (1/D) Σ_d λ_d.   (1.36)

Then we collect all the terms of the lower bound that contain Σ:

L_[Σ] = −(D/2) log |Σ| − (1/2) Σ_d ( Trace(diag(ν_d²) Σ^{−1}) + (λ_d − μ)^T Σ^{−1} (λ_d − μ) )   (1.37)
= −(D/2) log |Σ| − (1/2) Σ_d ( Trace(diag(ν_d²) Σ^{−1}) + Trace((λ_d − μ)^T Σ^{−1} (λ_d − μ)) )   (1.38)
= −(D/2) log |Σ| − (1/2) Σ_d Trace( Σ^{−1} ( diag(ν_d²) + (λ_d − μ)(λ_d − μ)^T ) ).   (1.39)

In the above we used the trace trick: for square matrices A and B, Trace(AB) = Trace(BA), and a scalar equals its own trace. Next we take the derivative of L_[Σ] with respect to Σ^{−1} (using log |Σ| = −log |Σ^{−1}|), with the properties (1) ∂ log |A| / ∂A = (A^{−1})^T and (2) ∂ Trace(AB)/∂A = B^T:

∂L_[Σ]/∂Σ^{−1} = (D/2) Σ^T − (1/2) Σ_d ( diag(ν_d²) + (λ_d − μ)(λ_d − μ)^T )^T,   (1.40)

so that, since all the matrices involved are symmetric, setting the derivative to zero yields

Σ̂ = (1/D) Σ_d ( diag(ν_d²) + (λ_d − μ)(λ_d − μ)^T ).   (1.41)

1.4 Discussion of the convergence

We can discuss the convergence of variational EM for the CTM somewhat informally, tracking either the change of the conditional likelihood log p(w | μ, Σ, β) or the change of the lower bound L(λ, ν, φ | μ, Σ, β). Since both the E-step and the M-step perform coordinate ascent on the lower bound L(λ, ν, φ | μ, Σ, β), the bound never decreases and converges (to a local optimum). As for log p(w | μ, Σ, β): in the E-step we increase its lower bound with respect to the variational parameters, and in the M-step we increase the bound further, so the likelihood probably increases. However, D(q(η, z | λ, ν, φ) ‖ p(η, z | μ, Σ, β, w)) is usually nonzero and may change after optimizing the model parameters, so it is not obvious that the likelihood itself converges monotonically even though its lower bound never decreases.
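The M-step updates (1.36) and (1.41) amount to simple averages over documents. A minimal sketch with NumPy (the function name and array layout are our own choices):

```python
import numpy as np

def estimate_mu_sigma(lams, nu2s):
    # Eqs. (1.36) and (1.41): mu_hat is the mean of the lambda_d, and
    # Sigma_hat averages diag(nu2_d) plus the outer products of the residuals.
    lams = np.asarray(lams)    # shape (D, K): lambda_d stacked by document
    nu2s = np.asarray(nu2s)    # shape (D, K): nu2_d stacked by document
    mu = lams.mean(axis=0)
    diff = lams - mu
    Sigma = (diff.T @ diff + np.diag(nu2s.sum(axis=0))) / len(lams)
    return mu, Sigma
```

Note that `diff.T @ diff` computes Σ_d (λ_d − μ)(λ_d − μ)^T in one matrix product, and the diag(ν_d²) terms contribute only to the diagonal, so Σ̂ stays symmetric positive semidefinite.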
More informationLecture XII. where Φ is called the potential function. Let us introduce spherical coordinates defined through the relations
Lecture XII Abstract We introuce the Laplace equation in spherical coorinates an apply the metho of separation of variables to solve it. This will generate three linear orinary secon orer ifferential equations:
More informationLinear First-Order Equations
5 Linear First-Orer Equations Linear first-orer ifferential equations make up another important class of ifferential equations that commonly arise in applications an are relatively easy to solve (in theory)
More informationGenerative learning methods for bags of features
Generative learning methos for bags of features Moel the robability of a bag of features given a class Many slies aate from Fei-Fei Li, Rob Fergus, an Antonio Torralba Generative methos We ill cover to
More informationExpectation Maximization and Mixtures of Gaussians
Statistical Machine Learning Notes 10 Expectation Maximiation and Mixtures of Gaussians Instructor: Justin Domke Contents 1 Introduction 1 2 Preliminary: Jensen s Inequality 2 3 Expectation Maximiation
More informationCode_Aster. Detection of the singularities and calculation of a map of size of elements
Titre : Détection es singularités et calcul une carte [...] Date : 0/0/0 Page : /6 Responsable : DLMAS Josselin Clé : R4.0.04 Révision : Detection of the singularities an calculation of a map of size of
More informationLDA with Amortized Inference
LDA with Amortied Inference Nanbo Sun Abstract This report describes how to frame Latent Dirichlet Allocation LDA as a Variational Auto- Encoder VAE and use the Amortied Variational Inference AVI to optimie
More informationOptimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations
Optimize Schwarz Methos with the Yin-Yang Gri for Shallow Water Equations Abessama Qaouri Recherche en prévision numérique, Atmospheric Science an Technology Directorate, Environment Canaa, Dorval, Québec,
More informationLower bounds on Locality Sensitive Hashing
Lower bouns on Locality Sensitive Hashing Rajeev Motwani Assaf Naor Rina Panigrahy Abstract Given a metric space (X, X ), c 1, r > 0, an p, q [0, 1], a istribution over mappings H : X N is calle a (r,
More informationMulti-View Clustering via Canonical Correlation Analysis
Technical Report TTI-TR-2008-5 Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri UC San Diego Sham M. Kakae Toyota Technological Institute at Chicago ABSTRACT Clustering ata in
More informationNOTES ON EULER-BOOLE SUMMATION (1) f (l 1) (n) f (l 1) (m) + ( 1)k 1 k! B k (y) f (k) (y) dy,
NOTES ON EULER-BOOLE SUMMATION JONATHAN M BORWEIN, NEIL J CALKIN, AND DANTE MANNA Abstract We stuy a connection between Euler-MacLaurin Summation an Boole Summation suggeste in an AMM note from 196, which
More informationSummary: Differentiation
Techniques of Differentiation. Inverse Trigonometric functions The basic formulas (available in MF5 are: Summary: Differentiation ( sin ( cos The basic formula can be generalize as follows: Note: ( sin
More informationBut if z is conditioned on, we need to model it:
Partially Unobserved Variables Lecture 8: Unsupervised Learning & EM Algorithm Sam Roweis October 28, 2003 Certain variables q in our models may be unobserved, either at training time or at test time or
More informationWitten s Proof of Morse Inequalities
Witten s Proof of Morse Inequalities by Igor Prokhorenkov Let M be a smooth, compact, oriente manifol with imension n. A Morse function is a smooth function f : M R such that all of its critical points
More information7.1 Support Vector Machine
67577 Intro. to Machine Learning Fall semester, 006/7 Lecture 7: Support Vector Machines an Kernel Functions II Lecturer: Amnon Shashua Scribe: Amnon Shashua 7. Support Vector Machine We return now to
More informationWEIGHTING A RESAMPLED PARTICLE IN SEQUENTIAL MONTE CARLO. L. Martino, V. Elvira, F. Louzada
WEIGHTIG A RESAMPLED PARTICLE I SEQUETIAL MOTE CARLO L. Martino, V. Elvira, F. Louzaa Dep. of Signal Theory an Communic., Universia Carlos III e Mari, Leganés (Spain). Institute of Mathematical Sciences
More informationIntroduction to variational calculus: Lecture notes 1
October 10, 2006 Introuction to variational calculus: Lecture notes 1 Ewin Langmann Mathematical Physics, KTH Physics, AlbaNova, SE-106 91 Stockholm, Sween Abstract I give an informal summary of variational
More information
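The Jensen's-inequality step above can be checked numerically on a toy model. The sketch below (an illustration with made-up numbers, not the full CTM) uses a simple discrete mixture: for any variational distribution q(z), the bound sum_z q(z) [log p(w, z) - log q(z)] is at most log p(w), with equality exactly when q(z) is the true posterior p(z | w).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete model: hidden topic z in {0..K-1}, observed word w in {0..V-1}.
K, V = 3, 5
prior = rng.dirichlet(np.ones(K))           # p(z)
beta = rng.dirichlet(np.ones(V), size=K)    # p(w | z), one row per topic

w = 2                                       # a single observed word

# Exact log evidence: log p(w) = log sum_z p(z) p(w | z)
joint = prior * beta[:, w]                  # p(w, z) for each value of z
log_evidence = np.log(joint.sum())

def elbo(q):
    """Jensen lower bound: sum_z q(z) [log p(w, z) - log q(z)]."""
    return np.sum(q * (np.log(joint) - np.log(q)))

# Any valid q yields a lower bound on the log evidence ...
q_uniform = np.ones(K) / K
assert elbo(q_uniform) <= log_evidence + 1e-12

# ... and the bound is tight at the true posterior q(z) = p(z | w).
posterior = joint / joint.sum()
assert np.isclose(elbo(posterior), log_evidence)
```

Variational EM exploits exactly this gap: the E-step tightens the bound by moving q toward the posterior (here, the variational parameters λ, ν, φ), and the M-step raises the bound in the model parameters (µ, Σ, β).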