Probabilistic Graphical Models


1 Probabilistic Graphical Models
David Sontag, New York University
Lecture 12, April 19, 2012
Acknowledgement: partially based on slides by Eric Xing at CMU and Andrew McCallum at UMass Amherst

2 Today: learning undirected graphical models
1. Learning MRFs
   a. Feature-based (log-linear) representation of MRFs
   b. Maximum likelihood estimation
   c. Maximum entropy view
2. Getting around the complexity of inference
   a. Using approximate inference (e.g., TRW) within learning
   b. Pseudo-likelihood
3. Conditional random fields

3 Recall: ML estimation in Bayesian networks
Maximum likelihood estimation: $\max_\theta \ell(\theta; \mathcal{D})$, where

$$\ell(\theta; \mathcal{D}) = \log p(\mathcal{D}; \theta) = \sum_{x \in \mathcal{D}} \log p(x; \theta) = \sum_i \sum_{\hat{x}_{pa(i)}} \sum_{x \in \mathcal{D} :\, x_{pa(i)} = \hat{x}_{pa(i)}} \log p(x_i \mid \hat{x}_{pa(i)})$$

In Bayesian networks, we have the closed-form ML solution:

$$\theta^{ML}_{x_i \mid x_{pa(i)}} = \frac{N_{x_i, x_{pa(i)}}}{\sum_{\hat{x}_i} N_{\hat{x}_i, x_{pa(i)}}}$$

where $N_{x_i, x_{pa(i)}}$ is the number of times that the (partial) assignment $x_i, x_{pa(i)}$ is observed in the training data.
We were able to estimate each CPD independently because the objective decomposes by variable and parent assignment.
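
As a quick illustration of this count-and-normalize solution, here is a minimal Python sketch (not part of the original slides; the data layout and function name are assumptions):

```python
import numpy as np
from collections import Counter

def mle_cpd(data, child, parents, k):
    """Closed-form ML estimate of p(x_child | x_parents) from counts:
    theta = N(x_i, x_pa) / sum over x_i' of N(x_i', x_pa)."""
    counts = Counter()
    for row in data:
        pa = tuple(row[p] for p in parents)
        counts[pa, row[child]] += 1
    cpd = {}
    for (pa, xi), n in counts.items():
        cpd.setdefault(pa, np.zeros(k))[xi] = n
    for pa in cpd:                     # normalize each parent assignment
        cpd[pa] /= cpd[pa].sum()       # independently of all the others
    return cpd

# Example: three binary variables; estimate p(x2 | x0, x1).
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(100, 3))
print(mle_cpd(data, child=2, parents=[0, 1], k=2))
```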

4 Bad news for Markov networks
The global normalization constant $Z(\theta)$ kills decomposability:

$$\theta^{ML} = \arg\max_\theta \sum_{x \in \mathcal{D}} \log p(x; \theta) = \arg\max_\theta \sum_{x \in \mathcal{D}} \Big( \sum_c \log \phi_c(x_c; \theta) - \log Z(\theta) \Big) = \arg\max_\theta \sum_{x \in \mathcal{D}} \sum_c \log \phi_c(x_c; \theta) - |\mathcal{D}| \log Z(\theta)$$

The log-partition function prevents us from decomposing the objective into a sum over terms for each potential.
Solving for the parameters becomes much more complicated.

5 What are the parameters?
How do we parameterize $\phi_c(x_c; \theta)$? Use a log-linear parameterization:
- Introduce weights $w \in \mathbb{R}^d$ that are used globally
- For each potential, a vector-valued feature function $f_c(x_c) \in \mathbb{R}^d$
- Then, $\phi_c(x_c; w) = \exp(w \cdot f_c(x_c))$
Example: a discrete-valued MRF with only edge potentials, where each variable takes $k$ states.
- Let $d = k^2 |E|$, and let $w_{i,j,x_i,x_j} = \log \phi_{ij}(x_i, x_j)$
- Let $f_{i,j}(x_i, x_j)$ have a 1 in the dimension corresponding to $(i, j, x_i, x_j)$ and 0 elsewhere
The joint distribution is in the exponential family! $p(x; w) = \exp\{w \cdot f(x) - \log Z(w)\}$, where $f(x) = \sum_c f_c(x_c)$ and $Z(w) = \sum_x \exp\{\sum_c w \cdot f_c(x_c)\}$.
This formulation allows for parameter sharing.
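
To make the indicator-feature construction concrete, here is a small sketch (an illustration, not from the slides; the flat indexing scheme is an assumption):

```python
import numpy as np

def joint_feature(x, edges, k):
    """f(x) = sum_c f_c(x_c) for the pairwise example: edge e contributes a
    one-hot vector with a 1 in the dimension corresponding to (e, x_i, x_j)."""
    f = np.zeros(len(edges) * k * k)
    for e, (i, j) in enumerate(edges):
        f[e * k * k + x[i] * k + x[j]] = 1.0
    return f

# With this encoding, w[e*k*k + xi*k + xj] plays the role of
# log phi_ij(xi, xj), and the unnormalized log-probability is w . f(x):
edges = [(0, 1), (1, 2)]
x = np.array([0, 1, 1])
w = np.log(np.full(len(edges) * 2 * 2, 0.5))   # k = 2; arbitrary potentials
print(w @ joint_feature(x, edges, k=2))
```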

6 Log-likelihood for log-linear models

$$\theta^{ML} = \arg\max_\theta \sum_{x \in \mathcal{D}} \sum_c \log \phi_c(x_c; \theta) - |\mathcal{D}| \log Z(\theta) = \arg\max_w \sum_{x \in \mathcal{D}} \sum_c w \cdot f_c(x_c) - |\mathcal{D}| \log Z(w) = \arg\max_w w \cdot \Big( \sum_{x \in \mathcal{D}} \sum_c f_c(x_c) \Big) - |\mathcal{D}| \log Z(w)$$

The first term is linear in $w$.
The second term is also a function of $w$: $\log Z(w) = \log \sum_x \exp\big( w \cdot \sum_c f_c(x_c) \big)$.

7 Log-likelihood for log-linear models

$$\log Z(w) = \log \sum_x \exp\Big( w \cdot \sum_c f_c(x_c) \Big)$$

$\log Z(w)$ does not decompose. There is no closed-form solution; even computing the likelihood requires inference.
Recall Problem 4 ("Exponential families") from Problem Set 2. Letting $f(x) = \sum_c f_c(x_c)$, you showed that

$$\nabla_w \log Z(w) = \mathbb{E}_{p(x;w)}[f(x)] = \sum_c \mathbb{E}_{p(x_c;w)}[f_c(x_c)]$$

Thus, the gradient of the log-partition function can be computed by inference, computing marginals with respect to the current parameters $w$.
We also claimed that the second derivative of the log-partition function gives the second-order moments, i.e., $\nabla^2 \log Z(w) = \mathrm{Cov}[f(x)]$.
Since covariance matrices are always positive semi-definite, this proves that $\log Z(w)$ is convex (so $-\log Z(w)$ is concave).
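
The identity $\nabla_w \log Z(w) = \mathbb{E}_{p(x;w)}[f(x)]$ is easy to verify numerically on a model small enough to enumerate. A minimal sketch (random features and brute-force enumeration; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 3, 4
F = rng.normal(size=(2 ** n, d))     # f(x) for each of the 2^n configurations
w = rng.normal(size=d)

def logZ(w):
    logits = F @ w
    m = logits.max()                 # log-sum-exp, stabilized
    return m + np.log(np.exp(logits - m).sum())

p = np.exp(F @ w - logZ(w))          # exact p(x; w) by enumeration
expected_f = F.T @ p                 # E_{p(x;w)}[f(x)]

# Central finite differences of log Z(w) recover the same vector:
eps = 1e-6
num_grad = np.array([(logZ(w + eps * e) - logZ(w - eps * e)) / (2 * eps)
                     for e in np.eye(d)])
print(np.allclose(num_grad, expected_f, atol=1e-6))   # True
```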

8 Solving the maximum likelihood problem in MRFs

$$\ell(w; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} w \cdot \sum_c f_c(x_c) - \log Z(w)$$

First, note that the weights $w$ are unconstrained, i.e., $w \in \mathbb{R}^d$.
The objective function is jointly concave. Apply any convex optimization method to learn! One can use gradient ascent, stochastic gradient ascent, or quasi-Newton methods such as limited-memory BFGS (L-BFGS).
The gradient of the log-likelihood is:

$$\frac{\partial \ell(w; \mathcal{D})}{\partial w_k} = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} \sum_c (f_c(x_c))_k - \sum_c \mathbb{E}_{p(x_c;w)}[(f_c(x_c))_k]$$

9 The gradient of the log-likelihood

$$\frac{\partial \ell(w; \mathcal{D})}{\partial w_k} = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} \sum_c (f_c(x_c))_k - \sum_c \mathbb{E}_{p(x_c;w)}[(f_c(x_c))_k]$$

A difference of expectations!
Consider the earlier pairwise MRF example. The gradient then reduces to:

$$\frac{\partial \ell(w; \mathcal{D})}{\partial w_{i,j,\hat{x}_i,\hat{x}_j}} = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} 1[x_i = \hat{x}_i, x_j = \hat{x}_j] - p(\hat{x}_i, \hat{x}_j; w)$$

Setting the derivative to zero, we see that for the maximum likelihood parameters $w_{ML}$ we have

$$p(\hat{x}_i, \hat{x}_j; w_{ML}) = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} 1[x_i = \hat{x}_i, x_j = \hat{x}_j]$$

for all edges $ij \in E$ and states $\hat{x}_i, \hat{x}_j$.
Model marginals for each clique equal the empirical marginals! This is called moment matching, and is a property of maximum likelihood learning in exponential families.
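
Putting the pieces together, here is a toy sketch of maximum likelihood learning by gradient ascent, using brute-force enumeration as the inference step (illustrative only, feasible just for tiny models); it also lets us check the moment-matching property at convergence:

```python
import itertools
import numpy as np

def feat(x, edges, k):
    """One-hot edge features: a 1 in the dimension for (e, x_i, x_j)."""
    f = np.zeros(len(edges) * k * k)
    for e, (i, j) in enumerate(edges):
        f[e * k * k + x[i] * k + x[j]] = 1.0
    return f

def learn_mrf(data, edges, k, iters=2000, lr=0.5):
    n = data.shape[1]
    all_x = list(itertools.product(range(k), repeat=n))
    F = np.array([feat(x, edges, k) for x in all_x])
    emp = np.mean([feat(x, edges, k) for x in data], axis=0)
    w = np.zeros(len(edges) * k * k)
    for _ in range(iters):
        logits = F @ w
        p = np.exp(logits - logits.max())
        p /= p.sum()                    # exact p(x; w) by enumeration
        w += lr * (emp - F.T @ p)       # gradient: empirical - model moments
    return w, F.T @ p, emp

rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(200, 3))
w, model_marg, emp_marg = learn_mrf(data, [(0, 1), (1, 2)], k=2)
print(np.abs(model_marg - emp_marg).max())   # ~0: moment matching
```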

10 Gradient ascent requires repeated marginal inference, which in many models is hard! We will return to this shortly.

11 Maximum entropy (MaxEnt)
We can approach the modeling task from an entirely different point of view. Suppose we know some expectations with respect to a (fully general) distribution $p(x)$:

$$\text{(true)} \quad \sum_x p(x) f_i(x), \qquad \text{(empirical)} \quad \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} f_i(x) = \alpha_i$$

Assuming that the expectations are consistent with one another, there may exist many distributions which satisfy them. Which one should we select? The most uncertain or flexible one, i.e., the one with maximum entropy. This yields a new optimization problem:

$$\max_p \; H(p(x)) = -\sum_x p(x) \log p(x) \quad \text{(strictly concave w.r.t. } p(x))$$
$$\text{s.t.} \quad \sum_x p(x) f_i(x) = \alpha_i, \qquad \sum_x p(x) = 1$$
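
For intuition, the MaxEnt program can be solved directly as a constrained optimization over a small state space. A sketch using scipy (the feature choices and target expectations below are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

xs = np.arange(4)                                  # x in {0, 1, 2, 3}
F = np.stack([xs.astype(float), (xs == 0).astype(float)])
alpha = np.array([1.2, 0.3])                       # target expectations

def neg_entropy(p):
    """-H(p); the floor inside the log avoids log(0) at the boundary."""
    return np.sum(p * np.log(np.maximum(p, 1e-12)))

constraints = [
    {"type": "eq", "fun": lambda p: F @ p - alpha},    # E_p[f_i] = alpha_i
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},    # normalization
]
res = minimize(neg_entropy, np.full(4, 0.25), method="SLSQP",
               bounds=[(0.0, 1.0)] * 4, constraints=constraints)
print(res.x)    # the MaxEnt distribution over the four states
```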

12 What does the MaxEnt solution look like?
To solve the MaxEnt problem, we form the Lagrangian:

$$L = -\sum_x p(x) \log p(x) - \sum_i \lambda_i \Big( \sum_x p(x) f_i(x) - \alpha_i \Big) - \mu \Big( \sum_x p(x) - 1 \Big)$$

Then, taking the derivative of the Lagrangian,

$$\frac{\partial L}{\partial p(x)} = -1 - \log p(x) - \sum_i \lambda_i f_i(x) - \mu$$

and setting it to zero, we obtain:

$$p^*(x) = \exp\Big( -1 - \mu - \sum_i \lambda_i f_i(x) \Big) = e^{-1-\mu} \, e^{-\sum_i \lambda_i f_i(x)}$$

From the constraint $\sum_x p(x) = 1$ we obtain $e^{1+\mu} = \sum_x e^{-\sum_i \lambda_i f_i(x)} = Z(\lambda)$.
We conclude that the maximum entropy distribution has the form (substituting $w_i = -\lambda_i$)

$$p^*(x) = \frac{1}{Z(w)} \exp\Big( \sum_i w_i f_i(x) \Big)$$

13 Equivalence of maximum likelihood and maximum entropy
Feature constraints + MaxEnt $\Rightarrow$ exponential family!
We have seen a case of convex duality:
- In one case, we assume an exponential family and show that ML implies the model expectations must match the empirical expectations.
- In the other case, we assume the model expectations must match the empirical feature counts and show that MaxEnt implies an exponential family distribution.
One can show that each problem is the dual of the other, and thus both obtain the same value of the objective at optimality (no duality gap).
Besides providing insight into the ML solution, this also gives an alternative way to (approximately) solve the learning problem.

14 How can we get around the complexity of inference during learning?

15 Monte Carlo methods
Recall the original learning objective:

$$\ell(w; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} w \cdot \sum_c f_c(x_c) - \log Z(w)$$

Use any of the sampling approaches (e.g., Gibbs sampling) that we discussed in Lecture 9.
All we need for learning (i.e., to compute the derivative of $\ell(w; \mathcal{D})$) are marginals of the distribution.
There is no need to ever estimate $\log Z(w)$.
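
A sketch of how sampling plugs into learning: estimate the model-expectation term of the gradient with a few Gibbs sweeps from a persistent chain instead of exact marginals. This is illustrative pairwise-MRF code with assumed names; in practice one would average over several chains and a minibatch of data points:

```python
import numpy as np

def feat(x, edges, k):
    """One-hot edge features, as in the log-linear parameterization."""
    f = np.zeros(len(edges) * k * k)
    for e, (i, j) in enumerate(edges):
        f[e * k * k + x[i] * k + x[j]] = 1.0
    return f

def gibbs_sweep(state, w, edges, k, rng):
    """Resample each x_i from p(x_i | x_{-i}; w) in turn."""
    W = w.reshape(len(edges), k, k)
    for i in range(len(state)):
        logits = np.zeros(k)
        for e, (a, b) in enumerate(edges):
            if a == i:
                logits += W[e, :, state[b]]
            elif b == i:
                logits += W[e, state[a], :]
        p = np.exp(logits - logits.max())
        state[i] = rng.choice(k, p=p / p.sum())
    return state

def sgd_step(w, x_data, state, edges, k, rng, lr=0.1, sweeps=5):
    """One stochastic gradient step: empirical features minus a one-sample
    Monte Carlo estimate of E_{p(x;w)}[f(x)] from the Gibbs chain."""
    for _ in range(sweeps):
        state = gibbs_sweep(state, w, edges, k, rng)
    w = w + lr * (feat(x_data, edges, k) - feat(state, edges, k))
    return w, state
```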

16 Using approximations of the log-partition function
We can substitute the original learning objective

$$\ell(w; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} w \cdot \sum_c f_c(x_c) - \log Z(w)$$

with one that uses a tractable approximation of the log-partition function:

$$\tilde{\ell}(w; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{x \in \mathcal{D}} w \cdot \sum_c f_c(x_c) - \log \tilde{Z}(w)$$

Recall from Lecture 8 that we came up with a convex relaxation that provides an upper bound on the log-partition function, $\log Z(w) \le \log \tilde{Z}(w)$ (e.g., tree-reweighted belief propagation, the log-determinant relaxation).
Using this, we obtain a lower bound on the learning objective, $\tilde{\ell}(w; \mathcal{D}) \le \ell(w; \mathcal{D})$.
Again, to compute the derivatives we only need the pseudo-marginals from the variational inference algorithm.

17 Pseudo-likelihood
Alternatively, can we come up with a different objective function (i.e., a different estimator) which succeeds at learning while avoiding inference altogether?
The pseudo-likelihood method (Besag 1971) yields an exact solution if the data is generated by a model in our model family $p(x; \theta^*)$ and $|\mathcal{D}| \to \infty$ (i.e., it is consistent).
Note that, via the chain rule, $p(x; w) = \prod_i p(x_i \mid x_1, \ldots, x_{i-1}; w)$.
We consider the following approximation:

$$p(x; w) \approx \prod_i p(x_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n; w) = \prod_i p(x_i \mid x_{-i}; w)$$

where we have added conditioning on additional variables.

18 Pseudo-likelihood
The pseudo-likelihood method replaces the likelihood,

$$\ell(\theta; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \log p(\mathcal{D}; \theta) = \frac{1}{|\mathcal{D}|} \sum_{m=1}^{|\mathcal{D}|} \log p(x^m; \theta)$$

with the following approximation:

$$\ell_{PL}(w; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{m=1}^{|\mathcal{D}|} \sum_{i=1}^{n} \log p(x_i^m \mid x_{N(i)}^m; w)$$

(we replaced $x_{-i}$ with $x_{N(i)}$, variable $i$'s Markov blanket).
For example, suppose we have a pairwise MRF. Then

$$p(x_i^m \mid x_{N(i)}^m; w) = \frac{1}{Z(x_{N(i)}^m; w)} e^{\sum_{j \in N(i)} \theta_{ij}(x_i^m, x_j^m)}, \qquad Z(x_{N(i)}^m; w) = \sum_{\hat{x}_i} e^{\sum_{j \in N(i)} \theta_{ij}(\hat{x}_i, x_j^m)}$$

More generally, using the log-linear parameterization, we have:

$$\log p(x_i^m \mid x_{N(i)}^m; w) = \sum_{c :\, i \in c} w \cdot f_c(x_c^m) - \log Z(x_{N(i)}^m; w)$$
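
A sketch of this objective for a pairwise MRF, where each term needs only the local partition function $Z(x_{N(i)}; w)$, a sum over the $k$ states of $x_i$ (illustrative code; the one-hot parameter layout matches the earlier sketches):

```python
import numpy as np

def pseudo_loglik(w, data, edges, k):
    """(1/|D|) sum_m sum_i log p(x_i^m | x_{N(i)}^m; w) for a pairwise MRF,
    with w[e, x_i, x_j] playing the role of theta_ij(x_i, x_j)."""
    W = w.reshape(len(edges), k, k)
    total = 0.0
    for x in data:
        for i in range(data.shape[1]):
            logits = np.zeros(k)     # sum_{j in N(i)} theta_ij(., x_j)
            for e, (a, b) in enumerate(edges):
                if a == i:
                    logits += W[e, :, x[b]]
                elif b == i:
                    logits += W[e, x[a], :]
            m = logits.max()         # local log-partition: sum over x_i only
            log_Zi = m + np.log(np.exp(logits - m).sum())
            total += logits[x[i]] - log_Zi
    return total / len(data)
```

Since each term is a standard softmax log-likelihood, the objective is concave and can be maximized with any gradient method.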

19 Pseudo-likelihood
This objective only involves summation over $x_i$ and is tractable.
It has many small partition functions (one for each variable and each setting of its neighbors) instead of one big one.
It is still concave in $w$ and thus has no local maxima.
Assuming the data is drawn from an MRF with parameters $w^*$, one can show that as the number of data points gets large, $w_{PL} \to w^*$.

20 Conditional random fields
Recall from Lecture 4 that a CRF is a Markov network on variables $X \cup Y$, which specifies the conditional distribution

$$P(y \mid x) = \frac{1}{Z(x)} \prod_{c \in C} \phi_c(x_c, y_c), \qquad \text{with partition function} \quad Z(x) = \sum_{\hat{y}} \prod_{c \in C} \phi_c(x_c, \hat{y}_c)$$

The feature functions now depend on $x$ in addition to $y$:
- For each potential, a vector-valued feature function $f_c(x_c, y_c) \in \mathbb{R}^d$
- Then, $\phi_c(x_c, y_c; w) = \exp(w \cdot f_c(x_c, y_c))$

21 Learning with conditional random fields
Exactly the same as learning with MRFs, except that we have a different partition function for each data point:

$$\theta^{ML} = \arg\max_\theta \sum_{(x,y) \in \mathcal{D}} \Big( \sum_c \log \phi_c(x_c, y_c; \theta) - \log Z(x; \theta) \Big) = \arg\max_w \sum_{(x,y) \in \mathcal{D}} w \cdot f(x, y) - \sum_{(x,y) \in \mathcal{D}} \log Z(x; w)$$
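
Finally, a sketch of the CRF objective and gradient for a tiny label space, making the per-example partition function $Z(x; w)$ explicit by enumerating all label vectors $\hat{y}$ (illustrative; `feat(x, y)` stands for the joint feature map $f(x, y)$ and is an assumed helper):

```python
import itertools
import numpy as np

def crf_loglik_and_grad(w, data, feat, k, n_label_vars):
    """Average conditional log-likelihood and its gradient; note that each
    training example (x, y) gets its own partition function Z(x; w)."""
    all_y = [np.array(y) for y in itertools.product(range(k), repeat=n_label_vars)]
    ll, grad = 0.0, np.zeros_like(w)
    for x, y in data:
        Fy = np.array([feat(x, yy) for yy in all_y])   # f(x, y') for all y'
        logits = Fy @ w
        m = logits.max()
        logZx = m + np.log(np.exp(logits - m).sum())   # log Z(x; w)
        p = np.exp(logits - logZx)                     # p(y' | x; w)
        ll += feat(x, y) @ w - logZx
        grad += feat(x, y) - Fy.T @ p                  # observed - expected
    return ll / len(data), grad / len(data)
```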
