Probabilistic Graphical Models
1 Probabilistic Graphical Models
David Sontag, New York University
Lecture 12, April 19, 2012
Acknowledgement: Partially based on slides by Eric Xing at CMU and Andrew McCallum at UMass Amherst
David Sontag (NYU), Graphical Models, Lecture 12, April 19, 2012
2 Today: learning undirected graphical models
1. Learning MRFs
   a. Feature-based (log-linear) representation of MRFs
   b. Maximum likelihood estimation; maximum entropy view
2. Getting around the complexity of inference
   a. Using approximate inference (e.g., TRW) within learning
   b. Pseudo-likelihood
3. Conditional random fields
3 Recall: ML estimation in Bayesian networks
Maximum likelihood estimation: $\max_\theta \ell(\theta; D)$, where
$$\ell(\theta; D) = \log p(D; \theta) = \sum_{x \in D} \log p(x; \theta) = \sum_{x \in D} \sum_i \log p(x_i \mid x_{pa(i)})$$
In Bayesian networks, we have the closed-form ML solution:
$$\theta^{ML}_{x_i \mid x_{pa(i)}} = \frac{N_{x_i, x_{pa(i)}}}{\sum_{\hat{x}_i} N_{\hat{x}_i, x_{pa(i)}}}$$
where $N_{x_i, x_{pa(i)}}$ is the number of times that the (partial) assignment $x_i, x_{pa(i)}$ is observed in the training data.
We were able to estimate each CPD independently because the objective decomposes by variable and parent assignment.
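The closed-form count-ratio estimate on the slide above can be sanity-checked on a toy example. This sketch uses an invented two-variable network (one binary parent, one binary child); the data and names are purely illustrative:

```python
from collections import Counter

# toy samples of (x_pa, x_i), both binary (data invented for illustration)
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]
counts = Counter(data)

def theta_ml(child, parent):
    # theta_{x_i | x_pa} = N(x_i, x_pa) / sum over x_i' of N(x_i', x_pa)
    denom = sum(counts[(parent, c)] for c in (0, 1))
    return counts[(parent, child)] / denom

print(theta_ml(0, 0), theta_ml(1, 0))  # 2/3 and 1/3: the counts for parent = 0
```

Because the objective decomposes, each conditional distribution is fit from its own counts, independently of the rest of the network.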
4 Bad news for Markov networks
The global normalization constant $Z(\theta)$ kills decomposability:
$$\theta^{ML} = \arg\max_\theta \sum_{x \in D} \log p(x; \theta) = \arg\max_\theta \sum_{x \in D} \Big( \sum_c \log \phi_c(x_c; \theta) - \log Z(\theta) \Big) = \arg\max_\theta \Big( \sum_{x \in D} \sum_c \log \phi_c(x_c; \theta) \Big) - |D| \log Z(\theta)$$
The log-partition function prevents us from decomposing the objective into a sum over terms for each potential.
Solving for the parameters becomes much more complicated.
5 What are the parameters?
How do we parameterize $\phi_c(x_c; \theta)$? Use a log-linear parameterization:
Introduce weights $w \in \mathbb{R}^d$ that are used globally.
For each potential $c$, a vector-valued feature function $f_c(x_c) \in \mathbb{R}^d$. Then, $\phi_c(x_c; w) = \exp(w \cdot f_c(x_c))$.
Example: discrete-valued MRF with only edge potentials, where each variable takes $k$ states.
Let $d = k^2 |E|$, and let $w_{i,j,x_i,x_j} = \log \phi_{ij}(x_i, x_j)$.
Let $f_{i,j}(x_i, x_j)$ have a 1 in the dimension corresponding to $(i, j, x_i, x_j)$ and 0 elsewhere.
The joint distribution is in the exponential family! $p(x; w) = \exp\{w \cdot f(x) - \log Z(w)\}$, where $f(x) = \sum_c f_c(x_c)$ and $Z(w) = \sum_x \exp\{w \cdot f(x)\}$.
This formulation allows for parameter sharing.
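The log-linear construction above can be made concrete on a tiny model. This is a minimal sketch, with an invented 3-node binary chain (edges (0,1) and (1,2)) and arbitrary weight values; the partition function is computed by brute-force enumeration, which is only feasible because the model is tiny:

```python
import itertools
import math

edges = [(0, 1), (1, 2)]
k = 2  # states per variable
# one weight per (edge, state pair): w[e][(a, b)] = log phi_e(a, b); values invented
w = {e: {(a, b): 0.5 * (a == b) for a in range(k) for b in range(k)} for e in edges}

def score(x):
    # w . f(x): each edge contributes the weight of its single active indicator
    return sum(w[(i, j)][(x[i], x[j])] for (i, j) in edges)

# partition function by exhaustive enumeration over all k^3 joint states
log_Z = math.log(sum(math.exp(score(x))
                     for x in itertools.product(range(k), repeat=3)))

def p(x):
    return math.exp(score(x) - log_Z)

total = sum(p(x) for x in itertools.product(range(k), repeat=3))
print(round(total, 6))  # a valid distribution: probabilities sum to 1
```

Note how the indicator features make $w \cdot f(x)$ collapse to a table lookup per edge, which is exactly the $w_{i,j,x_i,x_j} = \log \phi_{ij}(x_i, x_j)$ correspondence on the slide.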
6 Log-likelihood for log-linear models
$$\theta^{ML} = \arg\max_\theta \Big( \sum_{x \in D} \sum_c \log \phi_c(x_c; \theta) \Big) - |D| \log Z(\theta) = \arg\max_w \Big( \sum_{x \in D} \sum_c w \cdot f_c(x_c) \Big) - |D| \log Z(w) = \arg\max_w \; w \cdot \Big( \sum_{x \in D} \sum_c f_c(x_c) \Big) - |D| \log Z(w)$$
The first term is linear in $w$.
The second term is also a function of $w$:
$$\log Z(w) = \log \sum_x \exp\Big( w \cdot \sum_c f_c(x_c) \Big)$$
7 Log-likelihood for log-linear models
$$\log Z(w) = \log \sum_x \exp\Big( w \cdot \sum_c f_c(x_c) \Big)$$
$\log Z(w)$ does not decompose.
No closed-form solution; even computing the likelihood requires inference.
Recall Problem 4 ("Exponential families") from Problem Set 2. Letting $f(x) = \sum_c f_c(x_c)$, you showed that
$$\nabla_w \log Z(w) = \mathbb{E}_{p(x; w)}[f(x)] = \sum_c \mathbb{E}_{p(x_c; w)}[f_c(x_c)]$$
Thus, the gradient of the log-partition function can be computed by inference, computing marginals with respect to the current parameters $w$.
We also claimed that the second derivative of the log-partition function gives the second-order moments, i.e.,
$$\nabla^2 \log Z(w) = \mathrm{Cov}[f(x)]$$
Since covariance matrices are always positive semi-definite, this proves that $\log Z(w)$ is convex (so $-\log Z(w)$ is concave).
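The gradient identity $\nabla_w \log Z(w) = \mathbb{E}_{p(x;w)}[f(x)]$ can be checked numerically. This sketch uses an invented two-variable binary model with one indicator feature per joint state, and compares a central finite difference of $\log Z$ against the exact feature expectation:

```python
import itertools
import math

def features(x):
    # indicator feature vector of length 4: one entry per joint state (x0, x1)
    f = [0.0] * 4
    f[2 * x[0] + x[1]] = 1.0
    return f

def log_Z(w):
    return math.log(sum(
        math.exp(sum(wk * fk for wk, fk in zip(w, features(x))))
        for x in itertools.product((0, 1), repeat=2)))

def expected_features(w):
    lZ = log_Z(w)
    ef = [0.0] * 4
    for x in itertools.product((0, 1), repeat=2):
        px = math.exp(sum(wk * fk for wk, fk in zip(w, features(x))) - lZ)
        for k, fk in enumerate(features(x)):
            ef[k] += px * fk
    return ef

w = [0.3, -0.2, 0.1, 0.7]  # arbitrary test point
eps = 1e-6
max_err = 0.0
for k in range(4):
    wp, wm = list(w), list(w)
    wp[k] += eps
    wm[k] -= eps
    fd = (log_Z(wp) - log_Z(wm)) / (2 * eps)
    max_err = max(max_err, abs(fd - expected_features(w)[k]))
print(max_err < 1e-5)  # finite difference agrees with E_{p(x;w)}[f_k(x)]
```

This is why marginal inference is all we need for the gradient: the expectation above is exactly a (weighted) marginal computation under the current parameters.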
8 Solving the maximum likelihood problem in MRFs
$$\ell(w; D) = \frac{1}{|D|} \sum_{x \in D} \Big( w \cdot \sum_c f_c(x_c) \Big) - \log Z(w)$$
First, note that the weights $w$ are unconstrained, i.e., $w \in \mathbb{R}^d$.
The objective function is jointly concave. Apply any convex optimization method to learn!
Can use gradient ascent, stochastic gradient ascent, or quasi-Newton methods such as limited-memory BFGS (L-BFGS).
The gradient of the log-likelihood is:
$$\frac{d}{dw_k} \ell(w; D) = \frac{1}{|D|} \sum_{x \in D} \sum_c (f_c(x_c))_k - \sum_c \mathbb{E}_{p(x_c; w)}[(f_c(x_c))_k]$$
9 The gradient of the log-likelihood
$$\frac{\partial}{\partial w_k} \ell(w; D) = \frac{1}{|D|} \sum_{x \in D} \sum_c (f_c(x_c))_k - \sum_c \mathbb{E}_{p(x_c; w)}[(f_c(x_c))_k]$$
Difference of expectations!
Consider the earlier pairwise MRF example. This then reduces to:
$$\frac{\partial}{\partial w_{i,j,\hat{x}_i,\hat{x}_j}} \ell(w; D) = \frac{1}{|D|} \sum_{x \in D} 1[x_i = \hat{x}_i, x_j = \hat{x}_j] - p(\hat{x}_i, \hat{x}_j; w)$$
Setting the derivative to zero, we see that for the maximum likelihood parameters $w^{ML}$, we have
$$p(\hat{x}_i, \hat{x}_j; w^{ML}) = \frac{1}{|D|} \sum_{x \in D} 1[x_i = \hat{x}_i, x_j = \hat{x}_j]$$
for all edges $ij \in E$ and states $\hat{x}_i, \hat{x}_j$.
Model marginals for each clique equal the empirical marginals!
Called moment matching, and is a property of maximum likelihood learning in exponential families.
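Moment matching can be observed directly by running plain gradient ascent on a toy model. This sketch (dataset and step size invented) fits a single-edge binary MRF with one indicator weight per joint state; at convergence the model marginals reproduce the empirical ones:

```python
import itertools
import math

# toy dataset of joint assignments (x_i, x_j) for one edge
data = [(0, 0), (0, 0), (1, 1), (0, 1), (1, 0)]

def feat_index(x):
    return 2 * x[0] + x[1]  # which of the 4 indicator features fires

# empirical marginals = average feature vector over the data
emp = [0.0] * 4
for x in data:
    emp[feat_index(x)] += 1.0 / len(data)

# gradient ascent on the (concave) log-likelihood: w_k += eta * (emp_k - model_k)
w = [0.0] * 4
for _ in range(5000):
    Z = sum(math.exp(w[feat_index(x)])
            for x in itertools.product((0, 1), repeat=2))
    model = [0.0] * 4
    for x in itertools.product((0, 1), repeat=2):
        model[feat_index(x)] += math.exp(w[feat_index(x)]) / Z
    for k in range(4):
        w[k] += 0.5 * (emp[k] - model[k])

print([round(m, 3) for m in model])  # matches emp: [0.4, 0.2, 0.2, 0.2]
```

The update is literally the "difference of expectations" from the slide: empirical feature counts minus model expectations, the latter computed here by exact (brute-force) inference.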
10 Gradient ascent requires repeated marginal inference, which in many models is hard! We will return to this shortly.
11 Maximum entropy (MaxEnt)
We can approach the modeling task from an entirely different point of view.
Suppose we know some expectations with respect to a (fully general) distribution $p(x)$:
$$\text{(true)} \;\; \sum_x p(x) f_i(x), \qquad \text{(empirical)} \;\; \frac{1}{|D|} \sum_{x \in D} f_i(x) = \alpha_i$$
Assuming that the expectations are consistent with one another, there may exist many distributions which satisfy them. Which one should we select?
The most uncertain or flexible one, i.e., the one with maximum entropy.
This yields a new optimization problem:
$$\max_p \; H(p(x)) = -\sum_x p(x) \log p(x) \quad \text{(strictly concave w.r.t. } p(x)\text{)}$$
$$\text{s.t.} \quad \sum_x p(x) f_i(x) = \alpha_i, \qquad \sum_x p(x) = 1$$
12 What does the MaxEnt solution look like?
To solve the MaxEnt problem, we form the Lagrangian:
$$L = -\sum_x p(x) \log p(x) - \sum_i \lambda_i \Big( \sum_x p(x) f_i(x) - \alpha_i \Big) - \mu \Big( \sum_x p(x) - 1 \Big)$$
Then, taking the derivative of the Lagrangian,
$$\frac{\partial L}{\partial p(x)} = -1 - \log p(x) - \sum_i \lambda_i f_i(x) - \mu$$
and setting it to zero, we obtain:
$$p^*(x) = \exp\Big( -1 - \mu - \sum_i \lambda_i f_i(x) \Big) = e^{-1-\mu} \, e^{-\sum_i \lambda_i f_i(x)}$$
From the constraint $\sum_x p(x) = 1$ we obtain $e^{1+\mu} = \sum_x e^{-\sum_i \lambda_i f_i(x)} = Z(\lambda)$.
We conclude that the maximum entropy distribution has the form (substituting $w_i = -\lambda_i$)
$$p^*(x) = \frac{1}{Z(w)} \exp\Big( \sum_i w_i f_i(x) \Big)$$
13 Equivalence of maximum likelihood and maximum entropy
Feature constraints + MaxEnt ⇒ exponential family!
We have seen a case of convex duality:
In one case, we assume exponential family and show that ML implies model expectations must match empirical expectations.
In the other case, we assume model expectations must match empirical feature counts and show that MaxEnt implies an exponential family distribution.
Can show that one is the dual of the other, and thus both obtain the same value of the objective at optimality (no duality gap).
Besides providing insight into the ML solution, this also gives an alternative way to (approximately) solve the learning problem.
14 How can we get around the complexity of inference during learning?
15 Monte Carlo methods
Recall the original learning objective
$$\ell(w; D) = \frac{1}{|D|} \sum_{x \in D} \big( w \cdot f(x) \big) - \log Z(w)$$
Use any of the sampling approaches (e.g., Gibbs sampling) that we discussed in Lecture 9.
All we need for learning (i.e., to compute the derivative of $\ell(w; D)$) are marginals of the distribution.
No need to ever estimate $\log Z(w)$.
16 Using approximations of the log-partition function
We can substitute the original learning objective
$$\ell(w; D) = \frac{1}{|D|} \sum_{x \in D} \big( w \cdot f(x) \big) - \log Z(w)$$
with one that uses a tractable approximation of the log-partition function:
$$\hat{\ell}(w; D) = \frac{1}{|D|} \sum_{x \in D} \big( w \cdot f(x) \big) - \log \hat{Z}(w)$$
Recall from Lecture 8 that we came up with a convex relaxation that provided an upper bound on the log-partition function, $\log Z(w) \le \log \hat{Z}(w)$ (e.g., tree-reweighted belief propagation, log-determinant relaxation).
Using this, we obtain a lower bound on the learning objective, $\hat{\ell}(w; D) \le \ell(w; D)$.
Again, to compute the derivatives we only need pseudo-marginals from the variational inference algorithm.
17 Pseudo-likelihood
Alternatively, can we come up with a different objective function (i.e., a different estimator) which succeeds at learning while avoiding inference altogether?
The pseudo-likelihood method (Besag 1971) yields an exact solution if the data is generated by a model in our model family $p(x; \theta^*)$ and $|D| \to \infty$ (i.e., it is consistent).
Note that, via the chain rule, $p(x; w) = \prod_i p(x_i \mid x_1, \ldots, x_{i-1}; w)$.
We consider the following approximation:
$$p(x; w) \approx \prod_i p(x_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n; w) = \prod_i p(x_i \mid x_{-i}; w)$$
where we have added conditioning over additional variables.
18 Pseudo-likelihood
The pseudo-likelihood method replaces the likelihood,
$$\ell(\theta; D) = \frac{1}{|D|} \log p(D; \theta) = \frac{1}{|D|} \sum_{m=1}^{|D|} \log p(x^m; \theta)$$
with the following approximation:
$$\ell_{PL}(w; D) = \frac{1}{|D|} \sum_{m=1}^{|D|} \sum_{i=1}^n \log p(x^m_i \mid x^m_{N(i)}; w)$$
(we replaced $x_{-i}$ with $x_{N(i)}$, $i$'s Markov blanket).
For example, suppose we have a pairwise MRF. Then,
$$p(x^m_i \mid x^m_{N(i)}; w) = \frac{1}{Z(x^m_{N(i)}; w)} \, e^{\sum_{j \in N(i)} \theta_{ij}(x^m_i, x^m_j)}, \qquad Z(x^m_{N(i)}; w) = \sum_{\hat{x}_i} e^{\sum_{j \in N(i)} \theta_{ij}(\hat{x}_i, x^m_j)}$$
More generally, and using the log-linear parameterization, we have:
$$\log p(x^m_i \mid x^m_{N(i)}; w) = \sum_{c \,:\, i \in c} w \cdot f_c(x^m_c) - \log Z(x^m_{N(i)}; w)$$
19 Pseudo-likelihood
This objective only involves summation over $x_i$ and is tractable.
Has many small partition functions (one for each variable and each setting of its neighbors) instead of one big one.
It is still concave in $w$ and thus has no local maxima.
Assuming the data is drawn from an MRF with parameters $w^*$, can show that as the number of data points gets large, $\hat{w}_{PL} \to w^*$.
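The "many small partition functions" point can be made concrete in code. This is a sketch on an invented binary chain MRF (variables 0-1-2, arbitrary log-potentials): each conditional needs a local normalizer that sums over the two states of a single variable, never over the full joint:

```python
import math

# invented toy model: binary chain MRF on variables 0-1-2 with log-potentials theta
edges = [(0, 1), (1, 2)]
theta = {e: {(a, b): 0.3 * (a == b) for a in (0, 1) for b in (0, 1)} for e in edges}

def neighbors(i):
    return [j for (a, b) in edges
            for j in ((b,) if a == i else (a,) if b == i else ())]

def edge_theta(i, j, xi, xj):
    # look up theta for edge {i, j}, whichever orientation is stored
    return theta[(i, j)][(xi, xj)] if (i, j) in theta else theta[(j, i)][(xj, xi)]

def log_cond(i, x):
    # log p(x_i | x_N(i)): the local partition function sums over states of x_i only
    num = sum(edge_theta(i, j, x[i], x[j]) for j in neighbors(i))
    log_local_Z = math.log(sum(
        math.exp(sum(edge_theta(i, j, v, x[j]) for j in neighbors(i)))
        for v in (0, 1)))
    return num - log_local_Z

def pseudo_loglik(data):
    return sum(log_cond(i, x) for x in data for i in range(3)) / len(data)

pl = pseudo_loglik([(0, 0, 1), (1, 1, 1)])
print(pl < 0)  # a sum of log-probabilities, so always negative here
```

Each `log_cond` call touches only variable `i` and its Markov blanket, which is what makes the objective tractable even when global inference is not.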
20 Conditional random fields
Recall from Lecture 4, a CRF is a Markov network on variables $X \cup Y$, which specifies the conditional distribution
$$P(y \mid x) = \frac{1}{Z(x)} \prod_{c \in C} \phi_c(x_c, y_c)$$
with partition function $Z(x) = \sum_{\hat{y}} \prod_{c \in C} \phi_c(x_c, \hat{y}_c)$.
The feature functions now depend on $x$ in addition to $y$:
For each potential $c$, a vector-valued feature function $f_c(x_c, y_c) \in \mathbb{R}^d$. Then, $\phi_c(x_c, y_c; w) = \exp(w \cdot f_c(x_c, y_c))$.
21 Learning with conditional random fields
Exactly the same as learning with MRFs, except that we have a different partition function for each data point:
$$\theta^{ML} = \arg\max_\theta \sum_{(x,y) \in D} \Big( \sum_c \log \phi_c(x_c, y_c; \theta) - \log Z(x; \theta) \Big) = \arg\max_w \sum_{(x,y) \in D} w \cdot f(x, y) - \sum_{(x,y) \in D} \log Z(x; w)$$
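The per-example partition function is the one structural change from the MRF case, and it shows up directly in code. This is an illustrative sketch with an invented scalar-input, binary-output CRF (features chosen arbitrarily): note that `log_Z` takes `x` as an argument and is recomputed for every training pair:

```python
import math

# invented toy CRF: scalar input x, binary output y, two hand-picked features
def score(w, x, y):
    f = (x if y == 1 else 0.0, float(y))  # f(x, y) depends on x as well as y
    return w[0] * f[0] + w[1] * f[1]

def log_Z(w, x):
    # a separate partition function Z(x; w) for every input x
    return math.log(sum(math.exp(score(w, x, y)) for y in (0, 1)))

def cond_loglik(w, data):
    return sum(score(w, x, y) - log_Z(w, x) for (x, y) in data) / len(data)

data = [(1.0, 1), (0.5, 0), (2.0, 1)]
w = [0.0, 0.0]
print(round(cond_loglik(w, data), 4))  # -log 2 per example at w = 0: -0.6931
```

Because $y$ here takes only two values, each $Z(x; w)$ is a two-term sum; in a structured CRF (e.g., a chain over labels) the same quantity would be computed by dynamic programming per example.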
More informationON THE MOVING BOUNDARY HITTING PROBABILITY FOR THE BROWNIAN MOTION. Dobromir P. Kralchev
Pliska Stud. Math. Bulgar. 8 2007, 83 94 STUDIA MATHEMATICA BULGARICA ON THE MOVING BOUNDARY HITTING PROBABILITY FOR THE BROWNIAN MOTION Dobromir P. Kralhev Consider the probability that the Brownian motion
More informationStochastic Combinatorial Optimization with Risk Evdokia Nikolova
Computer Siene and Artifiial Intelligene Laboratory Tehnial Report MIT-CSAIL-TR-2008-055 September 13, 2008 Stohasti Combinatorial Optimization with Risk Evdokia Nikolova massahusetts institute of tehnology,
More informationSome recent developments in probability distributions
Proeedings 59th ISI World Statistis Congress, 25-30 August 2013, Hong Kong (Session STS084) p.2873 Some reent developments in probability distributions Felix Famoye *1, Carl Lee 1, and Ayman Alzaatreh
More informationAn iterative least-square method suitable for solving large sparse matrices
An iteratie least-square method suitable for soling large sparse matries By I. M. Khabaza The purpose of this paper is to report on the results of numerial experiments with an iteratie least-square method
More informationTutorial 4 (week 4) Solutions
THE UNIVERSITY OF SYDNEY PURE MATHEMATICS Summer Shool Math26 28 Tutorial week s You are given the following data points: x 2 y 2 Construt a Lagrange basis {p p p 2 p 3 } of P 3 using the x values from
More informationNeed for Sampling in Machine Learning. Sargur Srihari
Need for Sampling in Machine Learning Sargur srihari@cedar.buffalo.edu 1 Rationale for Sampling 1. ML methods model data with probability distributions E.g., p(x,y; θ) 2. Models are used to answer queries,
More informationMCMC and Gibbs Sampling. Kayhan Batmanghelich
MCMC and Gibbs Sampling Kayhan Batmanghelich 1 Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction
More informationThe Laws of Acceleration
The Laws of Aeleration The Relationships between Time, Veloity, and Rate of Aeleration Copyright 2001 Joseph A. Rybzyk Abstrat Presented is a theory in fundamental theoretial physis that establishes the
More informationRelativity in Classical Physics
Relativity in Classial Physis Main Points Introdution Galilean (Newtonian) Relativity Relativity & Eletromagnetism Mihelson-Morley Experiment Introdution The theory of relativity deals with the study of
More informationCORC Report TR : Short Version Optimal Procurement Mechanisms for Divisible Goods with Capacitated Suppliers
CORC Report TR-2006-01: Short Version Optimal Prourement Mehanisms for Divisible Goods with Capaitated Suppliers Garud Iyengar Anuj Kumar First version: June 30, 2006 This version: August 31, 2007 Abstrat
More informationu x u t Internal Waves
Internal Waves We now examine internal waves for the ase in whih there are two distint layers and in whih the lower layer is at rest. This is an approximation of the ase in whih the upper layer is muh
More informationQUANTUM MECHANICS II PHYS 517. Solutions to Problem Set # 1
QUANTUM MECHANICS II PHYS 57 Solutions to Problem Set #. The hamiltonian for a lassial harmoni osillator an be written in many different forms, suh as use ω = k/m H = p m + kx H = P + Q hω a. Find a anonial
More information6 Dynamic Optimization in Continuous Time
6 Dynami Optimization in Continuous Time 6.1 Dynami programming in ontinuous time Consider the problem Z T max e rt u (k,, t) dt (1) (t) T s.t. k ú = f (k,, t) (2) k () = k, (3) with k (T )= k (ase 1),
More informationDual Decomposition for Inference
Dual Decomposition for Inference Yunshu Liu ASPITRG Research Group 2014-05-06 References: [1]. D. Sontag, A. Globerson and T. Jaakkola, Introduction to Dual Decomposition for Inference, Optimization for
More informationCan Learning Cause Shorter Delays in Reaching Agreements?
Can Learning Cause Shorter Delays in Reahing Agreements? Xi Weng 1 Room 34, Bldg 2, Guanghua Shool of Management, Peking University, Beijing 1871, China, 86-162767267 Abstrat This paper uses a ontinuous-time
More information