Exact Bayesian Pairwise Preference Learning and Inference on the Uniform Convex Polytope

Scott Sanner
NICTA & ANU
Canberra, ACT, Australia

Ehsan Abbasnejad
ANU & NICTA
Canberra, ACT, Australia

Abstract

In Bayesian approaches to utility learning from preferences, the objective is to infer a posterior belief distribution over an agent's utility function based on previously observed agent preferences. From this, one can then estimate quantities such as the expected utility of a decision or the probability of an unobserved preference, which can then be used to make or suggest future decisions on behalf of the agent. However, there remains an open question as to how one can represent beliefs over agent utilities, perform Bayesian updating based on observed agent pairwise preferences, and make inferences with this posterior distribution in an exact, closed form. In this paper, we build on Bayesian pairwise preference learning models under the assumptions of linearly additive multi-attribute utility functions and a bounded uniform utility prior. These assumptions lead to a posterior form that is a uniform distribution over a convex polytope, for which we then demonstrate how to perform exact, closed-form inference w.r.t. this posterior, i.e., without resorting to sampling or other approximation methods.

1 Introduction

Decision support systems (DSSs) require knowledge of an agent's preferences in order to make or suggest future decisions on their behalf. Clearly then, the most crucial task of a DSS is learning to predict future agent preferences from previously observed agent preferences. In recent years, an increasingly popular approach to preference learning relies on Bayesian updating of posterior beliefs in an underlying (multi-attribute) utility function conditioned on observed agent preferences [4, 5, 3, 6, 8, 12]; inference can then be performed by marginalizing over these posterior utility beliefs w.r.t. future preference queries. This Bayesian preference learning (BPL) is attractive for a multitude of reasons, among them (a) the generally acknowledged robustness of Bayesian learning methods to overfitting, (b) the fact that updates can be done online, and (c) the fact that Bayesian methods provide a useful measure of confidence in their predictions via a probability distribution over predicted outcomes.

Despite the attractiveness of BPL, we are not aware of a Bayesian pairwise preference learning (BPPL) approach that claims to do all learning and inference in an exact, closed form. Arguably the two most common priors for BPPL are either Gaussian or Uniform. In the former case, a Gaussian prior is not conjugate to commonly used pairwise preference likelihoods, leading to the need for approximate Bayesian updating in order to maintain a tractable form for the posterior [4, 3, 6, 8]; in the latter case, certain Uniform distributions are indeed conjugate to commonly used pairwise preference likelihoods, but inference with the posterior convex polytope requires difficult integrals that in practice are approximated via sampling or other methods [5, 12].

In this paper, we argue that the exact integrals in the case of BPPL with commonly used linearly additive multi-attribute utility functions and a uniform utility prior are not as computationally difficult as they may appear. Indeed, armed with symbolic inference for piecewise functions and efficient data structures derived from algebraic decision diagrams (ADDs) for representing and performing operations on these functions, we show that such inference can be computed efficiently for reasonably sized problems in practice. This is the first time (to the best of the authors' knowledge) that an explicit algorithm has been provided for doing BPPL and all associated inference in exact, closed form.

2 Bayesian Pairwise Preference Learning (BPPL)

Here we follow much of the BPPL setup of [8], except for the case of a Uniform rather than Gaussian prior distribution. In multi-attribute utility theory (MAUT) [9], utilities are modeled over a D-dimensional attribute vector x where each attribute choice x_d (1 ≤ d ≤ D) is either binary (x_d ∈ {0 (false), 1 (true)}) or real-valued (x_d ∈ R). An item is described by its attribute choice assignments x = (x_1, ..., x_D). For example, let our item space consist of possible cars we might buy, let binary attribute choice x_1 represent whether the car is a 2-door (if false, we assume it is a 4-door), and let x_2 represent the amount of engine displacement in liters. Perhaps two of our available items are a 2.4L two-door Accord with attribute vector x_a = (1, 2.4) and a 3.5L four-door Buick with attribute vector x_b = (0, 3.5).

Our goal in BPPL will be to learn an attribute weight vector w ∈ R^D that describes the utility of each attribute choice in each attribute dimension. As commonly done in MAUT [9, 1], we assume that the overall utility u(x; w) of item x w.r.t. attribute weight vector w decomposes additively over the attribute choices of x, i.e.,

    u(x; w) = Σ_{d=1}^{D} w_d x_d.    (1)

We let u(·; w) represent the user's true utility w.r.t. their true (but hidden) w. Since w is unknown to the decision support system, we take a Bayesian perspective on learning [4] and thus maintain a probability distribution P(w) representing our beliefs over w. For this, we must define our prior utility beliefs, the likelihood (preference query response model), the form of the posterior for our Bayesian update, and the form of any inferences we wish to make with the posterior.

2.1 Utility Prior

For our prior beliefs P(w), we start with a hyper-rectangular multivariate uniform distribution over w represented as follows:

    P(w) = Π_{d=1}^{D} p(w_d) = Π_{d=1}^{D} U(w_d; −C, C),    (2)

where C ∈ R (C ≥ 0) is some constant¹ and the uniform distribution U is simply defined using an indicator function²:

    U(w_d; L, U) = 1/(U − L) · I[(w_d ≥ L) ∧ (w_d ≤ U)].

By using respective lower and upper bounds of −C and C for each uniform dimension of the prior, we enforce that E_{P(w)}[w] = 0, indicating agnostic prior beliefs about the utility of each attribute. As this will be an important observation for future sections, we remark that such a hyper-rectangular uniform prior as defined here can also be conveniently viewed as the intersection (product) of half-spaces given by the indicator definition of U.

¹ In practice, C could be different for each dimension, if justified.
² We use I[·] as an indicator function taking the value 1 when its argument is true and 0 otherwise.
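To ground the model, the following minimal Python/NumPy sketch (not from the paper; the function names and the candidate weight vector are illustrative) evaluates the additive utility (1) and the box-uniform prior (2) on the two-attribute car example:

```python
import numpy as np

def utility(x, w):
    """Additive multi-attribute utility u(x; w) = sum_d w_d * x_d from (1)."""
    return float(np.dot(w, x))

def prior_density(w, C=1.0):
    """Hyper-rectangular uniform prior (2): a product over dimensions of
    (1 / 2C) * I[-C <= w_d <= C], i.e., an intersection of half-spaces."""
    w = np.asarray(w, dtype=float)
    return float(np.all((w >= -C) & (w <= C))) / (2.0 * C) ** len(w)

# The two car items from the running example: a 2.4L two-door and a 3.5L four-door.
x_a, x_b = np.array([1.0, 2.4]), np.array([0.0, 3.5])
w = np.array([0.8, -0.3])                # one candidate weight vector inside the prior box
print(utility(x_a, w), utility(x_b, w))  # approx. 0.08 and -1.05
print(prior_density(w), prior_density([2.0, 0.0]))  # 0.25 and 0.0 (outside the box for C = 1)
```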

2.2 Query & User Response Model

In this paper, we focus on pairwise comparison queries, known to require low cognitive load for users to answer [7]. We use Q_ab = {a ≻ b, a ≺ b, a ∼ b} to represent a pairwise comparison query indicating the user's preference for item x_a vs. item x_b (henceforth just a and b). Depending on the user's attribute weight vector w and the corresponding item utilities u(a; w) and u(b; w), the user's response q_ab ∈ Q_ab indicates the following:

a ≻ b: the user prefers a to b;
a ≺ b: the user prefers b to a;
a ∼ b: the user is indifferent between a and b.

We represent the user query model with an indicator function over the pairwise utility differences as follows:

    P(Q_ab = a ≻ b | w) = I[ u(a; w) − u(b; w) > ɛ ]
    P(Q_ab = a ≺ b | w) = I[ u(b; w) − u(a; w) > ɛ ]
    P(Q_ab = a ∼ b | w) = I[ |u(a; w) − u(b; w)| ≤ ɛ ],    (3)

where we can modulate the range of utility differences for which the user is indifferent by adjusting ɛ. Note that by definition, Σ_{q_ab} P(q_ab | w) = 1. If needed, we could easily model more noise in the elicitation process simply by allowing non-zero probability for weight vectors that disagree with the query response. For example, here we model a 10% chance that a user's query response disagrees with their utility valuation of items a and b by more than ɛ:

    P(Q_ab = a ≻ b | w) = 0.9 · I[ u(a; w) − u(b; w) > ɛ ] + 0.1 · I[ u(a; w) − u(b; w) ≤ ɛ ].    (4)

Whatever exact form of the above query response model is used, we note that due to the linearity of u(·; w) in w, all of the above query response likelihood models can be represented by a (sum of) linearly separated half-spaces owing to the linear constraints in the likelihood indicator functions.

2.3 Bayesian Utility Updating

Given a prior utility belief P(w | R_n) w.r.t. a (possibly empty) set of n ≥ 0 query responses R_n = {q_kl} and a new query response q_ab, we perform the following Bayesian update to obtain a posterior belief P(w | R_{n+1}), where R_{n+1} = R_n ∪ {q_ab}:

    P(w | R_{n+1}) ∝ P(q_ab | w, R_n) P(w | R_n) = P(q_ab | w) P(w | R_n).    (5)

Assuming that our query likelihood P(q_ab | w) is modeled as described in (3) [or (4)], we note that since the prior is a product of linear half-spaces and the likelihood is also [a sum of] linear half-spaces and these are multiplied, the resulting posterior is a uniform convex polytope³ [or, after placing the result in sum-of-products form, a sum of uniform convex polytopes]. Hence, an unnormalized Bayesian update is computationally straightforward and in fact has a very efficient O(nD) complexity for the case of the likelihood model in (3). In Figure 1, we show a D = 2 representation of a posterior for a likelihood in the form of (3).

2.4 Preference Inference

Now that we have posterior beliefs in our utility weighting P(w | R_n), we would like to perform at least two types of inference:

Item with maximum expected utility:

    arg max_x E_w[ u(x; w) | R_n ] = arg max_x ∫ P(w | R_n) [ Σ_{d=1}^{D} w_d x_d ] dw

Probability of preference outcomes:

    P(q_ab | R_n) = ∫ P(q_ab | w) P(w | R_n) dw

³ Here an unnormalized uniform distribution on the convex polytope defined by the product of linear constraints in the indicator functions.
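Both the update (5) and the two inference queries above can be checked numerically with a short Monte Carlo sketch. This is only a sanity check under assumed item vectors and ɛ, not the exact symbolic method developed in this paper (which avoids sampling entirely):

```python
import numpy as np

rng = np.random.default_rng(0)
C, eps = 1.0, 0.05
u = lambda x, w: float(np.dot(w, x))              # additive utility (1)

def likelihood(q, x_a, x_b, w):
    """Noise-free response model (3): an indicator over the utility difference."""
    d = u(x_a, w) - u(x_b, w)
    return {'a>b': d > eps, 'b>a': -d > eps, 'a~b': abs(d) <= eps}[q]

def in_posterior_polytope(w, responses):
    """Unnormalized posterior (5): prior box indicator times response indicators."""
    return bool(np.all(np.abs(w) <= C)) and all(likelihood(q, xa, xb, w) for q, xa, xb in responses)

# Observed response set R_n: the user preferred item a = (1, 2.4) to item b = (0, 3.5).
R_n = [('a>b', np.array([1.0, 2.4]), np.array([0.0, 3.5]))]
W = rng.uniform(-C, C, size=(100_000, 2))               # samples from the uniform prior
post = W[[in_posterior_polytope(w, R_n) for w in W]]    # samples inside the posterior polytope

x = np.array([1.0, 3.0])                            # an item whose expected utility we want
print("E[u(x; w) | R_n] ~", np.mean(post @ x))      # first query type
x_c, x_d = np.array([1.0, 1.0]), np.array([0.0, 2.0])
print("P(c > d | R_n)  ~", np.mean([likelihood('a>b', x_c, x_d, w) for w in post]))  # second query type
```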

Figure 1: A representation of the uniform convex polytope representing the posterior distribution on w when using the likelihood model in (3). The union of the striped and cross-hatched areas shows the non-zero posterior polytope before updating with query response q_ab, and the cross-hatched area shows the non-zero posterior polytope after updating with query response q_ab. We note that query response q_cd would not lead to any change in either posterior (before or after updating with q_ab), and its constraint could be pruned from the constraint set.

If using likelihood form (3), we note that both inferences are volume integrals over a function (u(x; w) in the first case and a constant in the latter case) subject to boundary constraints over a convex polytope defined w.r.t. P(w | R_n) (appearing in both inferences) and P(q_ab | w) (only for the second inference). In the case of likelihood form (4), we simply get a sum of such integrals.

Consequently, the question we have to ask in order to perform exact inference in this Uniform BPPL setting is the following: given a function defined over the region of a convex polytope like the one in Figure 1, how do we compute this integral in closed form? At first, this would seem quite difficult (the boundaries in any dimension in Figure 1 may be a piecewise function of the other dimensions), but on closer inspection, there is a systematic way to compute these integrals if given the right representation. We discuss this next.

3 Symbolic Variable Elimination in Discrete/Continuous Graphical Models

In this section, we introduce a case notation and operations for piecewise polynomial functions and demonstrate that all Uniform BPPL operations, including the required integrals, can be computed exactly and in closed form using a purely symbolic representation.

3.1 Case Representation and Operators

For this section, we will assume all functions are represented in case form [11] as follows:

    f = { φ_1 : f_1 ;  ...  ; φ_k : f_k }    (6)

Here, the f_i may be polynomial functions of w (although initially the functions f_i will be constant or linear, we will see that integration will introduce higher-order polynomials for the f_i). The φ_i are logical formulae defined over w that can consist of arbitrary logical combinations of inequalities (≥, >, ≤, <) over linear functions of w. We assume that the set of conditions {φ_1, ..., φ_k} disjointly and exhaustively partition the space of w such that f is well-defined for all w ∈ R^D. It is easy to verify that such a representation can represent all of the functions discussed in Section 2.3.

Unary operations such as scalar multiplication c · f (for some constant c ∈ R) or negation −f on a case statement f are straightforward; the unary operation is simply applied to each f_i (1 ≤ i ≤ k).
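A minimal Python sketch of the case representation (6) (the class and method names here are invented for illustration) stores a list of (φ_i, f_i) partitions and applies a unary operation leaf-wise, exactly as described above:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One partition: a condition phi_i(w) and a value function f_i(w).  The phi_i are
# assumed to disjointly and exhaustively partition the space of w.
Partition = Tuple[Callable[[List[float]], bool], Callable[[List[float]], float]]

@dataclass
class Case:
    partitions: List[Partition]

    def __call__(self, w: List[float]) -> float:
        for phi, f in self.partitions:
            if phi(w):
                return f(w)
        raise ValueError("conditions must exhaustively cover w")

    def scale(self, c: float) -> "Case":
        """Unary operation c * f: apply the scalar to every leaf f_i."""
        return Case([(phi, (lambda fi: lambda w: c * fi(w))(f)) for phi, f in self.partitions])

# Example: f = { w_1 > 0 : w_1 ; w_1 <= 0 : 0 }, then the unary operation 3 * f.
f = Case([(lambda w: w[0] > 0, lambda w: w[0]),
          (lambda w: w[0] <= 0, lambda w: 0.0)])
print(f([2.0]), f.scale(3.0)([2.0]))   # 2.0 6.0
```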

Intuitively, to perform a binary operation on two case statements, we simply take the cross-product of the logical partitions of each case statement and perform the corresponding operation on the resulting paired partitions. Thus, we perform the cross-sum ⊕ of two (unnamed) cases as follows:

    { φ_1 : f_1 ; φ_2 : f_2 } ⊕ { ψ_1 : g_1 ; ψ_2 : g_2 } =
    { φ_1 ∧ ψ_1 : f_1 + g_1 ;
      φ_1 ∧ ψ_2 : f_1 + g_2 ;
      φ_2 ∧ ψ_1 : f_2 + g_1 ;
      φ_2 ∧ ψ_2 : f_2 + g_2 }

Likewise, we perform ⊖ and ⊗ by, respectively, subtracting or multiplying partition values (rather than adding them) to obtain the result. Some partitions resulting from the application of the ⊕, ⊖, and ⊗ operators may be inconsistent (infeasible); if we can detect this (e.g., via a linear constraint solver), we may simply discard such partitions as they are irrelevant to the function value.

For the Uniform BPPL inference defined in Section 2.4, we'll need to compute definite integrals, a fairly non-trivial operation that is discussed in its own section. But first we discuss maximization, needed for working with integral bounds. Symbolic maximization is fairly straightforward to define:

    max( { φ_1 : f_1 ; φ_2 : f_2 } , { ψ_1 : g_1 ; ψ_2 : g_2 } ) =
    { φ_1 ∧ ψ_1 ∧ (f_1 > g_1) : f_1 ;
      φ_1 ∧ ψ_1 ∧ (f_1 ≤ g_1) : g_1 ;
      φ_1 ∧ ψ_2 ∧ (f_1 > g_2) : f_1 ;
      φ_1 ∧ ψ_2 ∧ (f_1 ≤ g_2) : g_2 ;
      φ_2 ∧ ψ_1 ∧ (f_2 > g_1) : f_2 ;
      φ_2 ∧ ψ_1 ∧ (f_2 ≤ g_1) : g_1 ;
      φ_2 ∧ ψ_2 ∧ (f_2 > g_2) : f_2 ;
      φ_2 ∧ ψ_2 ∧ (f_2 ≤ g_2) : g_2 }

The key observation here is that case statements are closed under the max operation. While it may appear that this representation will lead to an unreasonable blowup in size, we note that the XADD we introduce later in Section 4 will be able to exploit the internal decision structure of this maximization to represent it much more compactly.

3.2 Integration on the Convex Polytope

We now have almost all of the operations needed to perform Uniform BPPL inference as defined in Section 2.4, except for a method to compute each of the integrals ∫_{w_1}, ..., ∫_{w_D} w.r.t. the convex polytope constraints such as those shown in Figure 1. We provide this integration technique here.

If we are computing ∫_{w_1} f dw_1 for f in (6), we can rewrite it in the following equivalent form:

    ∫_{w_1} [ ⊕_i I[φ_i] · f_i ] dw_1 = ⊕_i ∫_{w_1} I[φ_i] · f_i dw_1.    (7)

Hence we can compute the integrals separately for each case partition (producing a case statement) and then combine the results using ⊕. To continue with the integral for a single case partition, we introduce a concrete example. Let f_1 := w_2^3 · w_1 and φ_1 := [w_1 > w_2 − 1] ∧ [w_1 > −w_2 − 1] ∧ [w_1 ≤ w_2] ∧ [w_1 ≤ w_2 + 1] ∧ [w_2 > 0]. In computing ∫_{w_1} I[φ_1] · f_1 dw_1, the first thing we note is that the constraints involving w_1 in I[φ_1] can be used to restrict the integration range for w_1. In fact, we know that the integrand is only non-zero for max(w_2 − 1, −w_2 − 1) ≤ w_1 ≤ min(w_2, w_2 + 1). Using the max (and corresponding min) operations defined previously, let us write explicit functions for these bounds in piecewise linear case form for the respective lower and upper bounds LB and UB:

    LB := { w_2 − 1 > −w_2 − 1 : w_2 − 1 ;  w_2 − 1 ≤ −w_2 − 1 : −w_2 − 1 }
    UB := { w_2 < w_2 + 1 : w_2 ;  w_2 ≥ w_2 + 1 : w_2 + 1 }
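The binary apply and the symbolic casemax above (the same machinery that produces piecewise bounds such as LB and UB) can be sketched in a few lines of sympy. This is an illustrative simplification, not the paper's implementation, and the example cases are made up:

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')

# Case statements as lists of (condition, value) partitions.
f = [(w1 > 0, w1), (w1 <= 0, sp.Integer(0))]
g = [(w2 > 1, 2 * w2), (w2 <= 1, w2)]

def apply_op(f, g, op):
    """Binary apply: cross the partitions, conjoin the conditions, and combine the
    leaves with `op`.  Infeasible conjunctions could be pruned with an LP solver."""
    return [(sp.And(phi, psi), op(fi, gi)) for phi, fi in f for psi, gi in g]

def case_max(f, g):
    """Symbolic max: each crossed partition additionally splits on f_i > g_i, so the
    result remains a case statement with polynomial leaves."""
    out = []
    for phi, fi in f:
        for psi, gi in g:
            out.append((sp.And(phi, psi, fi > gi), fi))
            out.append((sp.And(phi, psi, fi <= gi), gi))
    return out

for cond, val in apply_op(f, g, lambda a, b: a + b):   # the cross-sum
    print(cond, '->', val)
print(len(case_max(f, g)), 'partitions in max(f, g)')  # 8, matching the display above
```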

Figure 2: Left: a tree representation of a case statement; each branch corresponds to a single case partition. Right: the minimal XADD representation of the same case statement; note the compact DAG structure.

Now we can rewrite the integral as⁴

    I[w_2 > 0] ⊗ ∫_{w_1=LB}^{UB} ( w_2^3 · w_1 ) dw_1.    (8)

Note here that I[w_2 > 0] is independent of w_1 and hence can factor outside the integral. With all indicator functions moved into the LB or UB or factored out, we can now compute the integral:

    I[w_2 > 0] ⊗ [ (w_2^3 · w_1^2) / 2 ]_{w_1=LB}^{w_1=UB}.    (9)

The question now is simply how to perform this evaluation. Here we note that every expression (variables, constants, indicator functions, etc.) can be written as a simple case statement or as operations on case statements, even the upper and lower bounds as shown previously. So the evaluation is simply:

    I[w_2 > 0] ⊗ ( [ w_2^3 / 2 ⊗ UB ⊗ UB ] ⊖ [ w_2^3 / 2 ⊗ LB ⊗ LB ] ).    (10)

Hence the result of the definite integration over a case partition of a polynomial function with piecewise linear constraints is a case statement with piecewise linear constraints over polynomial functions; this is somewhat remarkable given that all of the bound computations were symbolic. However, we are not yet done; there is one final step that we must include for correctness. Because our bounds are symbolic, it may be the case for some assignment to w_2 that LB ≥ UB (this occurs in infeasible portions of the convex polytope). In this case the integral should be zero since the constraints on w_1 could not be jointly satisfied. To enforce this symbolically, we simply need to ⊗ (10) by case statements representing the following inequalities for all pairs of upper and lower bounds:

    I[w_2 > w_2 − 1] ⊗ I[w_2 > −w_2 − 1] ⊗ I[w_2 + 1 > w_2 − 1] ⊗ I[w_2 + 1 > −w_2 − 1].    (11)

Of course, here I[w_2 > w_2 − 1] and I[w_2 + 1 > w_2 − 1] could be removed as tautologies, but not the other constraints. This provides the solution for a single case partition and, from (7), we just need to ⊕ the case statements from each case partition's integral evaluation to obtain the full integral, still in case form. Repeating this for all integrals ∫_{w_1}, ..., ∫_{w_D}, we can obtain closed-form exact results for all queries in Section 2.4!

⁴ The careful reader will note that because the lower bounds were defined in terms of > rather than ≥, we technically have an improper integral and need to take a limit. However, in taking the limit, we note that all integrands are continuous polynomials of order 0 or greater, so the limit exists and yields the same answer as substituting the limit value. Hence for polynomials, we need not be concerned about whether bounds are inclusive or not.

4 Extended ADDs (XADDs) for Case Statements

In practice, it can be prohibitively expensive to maintain a case statement representation of a value function with explicit partitions. Motivated by algebraic decision diagrams (ADDs) [2], which maintain compact representations for finite discrete functions, we use an extended ADD (XADD) formalism introduced in [11] and demonstrated in Figure 2 (right).
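To illustrate why the DAG form matters, here is a toy hash-consing sketch; it is not the actual XADD implementation of [11], and the class and method names are invented. Structurally identical decision nodes and leaves are interned once and shared, which is what turns the expanded tree of Figure 2 (left) into the compact DAG of Figure 2 (right):

```python
import sympy as sp

class XADDPool:
    """Toy hash-consed decision-diagram pool: structurally identical nodes are shared."""
    def __init__(self):
        self._nodes = {}                 # structural key -> node id

    def leaf(self, expr):
        return self._intern(('leaf', sp.sympify(expr)))

    def decision(self, test, hi, lo):
        # `test` is a linear inequality (a sympy Relational); hi/lo are node ids.
        if hi == lo:                     # redundant test: both branches identical
            return hi
        return self._intern(('dec', sp.sympify(test), hi, lo))

    def _intern(self, key):
        if key not in self._nodes:
            self._nodes[key] = len(self._nodes)
        return self._nodes[key]

    def size(self):
        return len(self._nodes)

w1, w2 = sp.symbols('w1 w2')
pool = XADDPool()
zero, lin = pool.leaf(0), pool.leaf(w1 + w2)
inner = pool.decision(w2 > 0, lin, zero)        # shared sub-diagram
root_a = pool.decision(w1 > 1, inner, zero)
root_b = pool.decision(w1 > -1, inner, zero)
print(pool.size())   # 5 shared nodes, versus duplicating `inner` in a tree representation
```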

Figure 3: The performance of the expected utility computation of (12) in terms of time (in ms) with varying dimensionality (utility attributes) and number of constraints (query responses).

In brief, we note that an XADD is like an algebraic decision diagram (ADD) [2] except that (a) the decision nodes can have arbitrary inequalities, equalities, or disequalities (one per node) and (b) the leaf nodes can represent arbitrary functions. The decision nodes still have a fixed order from root to leaf, and the standard ADD operations to build a canonical ADD (REDUCE) and to perform a binary operation on two ADDs (APPLY) still apply in the case of XADDs. Of particular importance is that the XADD is a directed acyclic graph (DAG), and hence is often much more compact than a tree representation. For example, in Figure 2 (left) we provide the function represented by the XADD in Figure 2 (right) expanded from a DAG into its full tree form. We note that not only is the XADD much more compact on account of its DAG, but that all unary and binary operations on XADDs can directly exploit this compact structure for efficiency. For more compactness and hence efficiency, we can apply a linear constraint solver to prune out unreachable branches of the convex polytope so that, for example, dominated polytope constraints for preference q_cd in Figure 1 can simply be pruned from the XADD. To this end, we can use the XADD to perform all case operations required for inference in Section 2.3 and hence efficiently perform the Uniform BPPL queries from Section 2.4, as we demonstrate next.

5 Empirical Results

In this section we demonstrate that performing the inference from Section 2.4 is efficient in most cases for a reasonable number of attribute dimensions and pairwise query response observations. In Figure 3, we illustrate how the algorithm performs as the number of attributes D (i.e., dimensionality) and the size n of the query response set R_n (i.e., number of constraints) vary. To run this experiment, we randomly generated an attribute weighting vector w of dimensionality D as well as n query responses R_n (ensuring all responses maintained a feasible utility region), performed Bayesian updating under the likelihood model of (3), and then carried out the expected utility subquery from Section 2.4:

    E_w[ u(x; w) | R_n ] = ∫ P(w | R_n) [ Σ_{d=1}^{D} w_d x_d ] dw.    (12)

We recorded the total time of this inference as a function of the dimensionality D and the number of constraints n. As we can observe in Figure 3, in general the expected utility inference runs within a reasonable time for most cases (the flat surface is not zero, but rather on the order of 100 ms or less). Curiously, it is only for a certain combination of constraints and dimensionality that the algorithm fails to run fast enough for real-time performance; empirically, it seems that over-constrained and under-constrained problems for lower dimensions are easiest for inference. It remains to explain exactly why this happens; intuitively, one would have thought that the smaller the number of constraints, the more easily the piecewise integrals could be evaluated. We are currently investigating this phenomenon in more detail.
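For a concrete 2-D instance of the expected utility query (12), the same integral can be carried out exactly with sympy by splitting the piecewise integration bound by hand, which mirrors (in miniature) the case-based handling of bounds from Section 3.2. This sketch is not the XADD-based algorithm timed in Figure 3; the item vectors, ɛ, and prior bound are illustrative:

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')
# Posterior polytope for the car example: prior box [-1, 1]^2 intersected with the
# half-space u(a; w) - u(b; w) > eps, i.e., w1 - 1.1*w2 > 0.05.
lb = sp.Rational(1, 20) + sp.Rational(11, 10) * w2    # linear lower bound on w1
s1 = sp.solve(sp.Eq(lb, -1), w2)[0]                   # below this w2, the box bound -1 is active
s2 = sp.solve(sp.Eq(lb, 1), w2)[0]                    # above this w2, the polytope is empty

def over_polytope(h):
    """Exact integral of h(w) over the polytope, splitting the w2 range at s1 and s2."""
    part_a = sp.integrate(sp.integrate(h, (w1, -1, 1)), (w2, -1, s1))
    part_b = sp.integrate(sp.integrate(h, (w1, lb, 1)), (w2, s1, s2))
    return part_a + part_b

volume = over_polytope(sp.Integer(1))                 # normalizer of the uniform posterior
x = (1, 1)                                            # item whose expected utility we want
expected_u = over_polytope(x[0] * w1 + x[1] * w2) / volume
print(expected_u, '~', float(expected_u))             # exact rational value and a float approximation
```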

6 Conclusions and Future Work

This work presented novel and efficient, exact Bayesian learning and inference methods for preference elicitation with the Uniform distribution on the convex polytope. In future work, we aim to investigate the following possible extensions:

Generalized additive independence [9, 1]: can we introduce additive independence over joint utility factors rather than individual utility attributes while still achieving efficient computation? In principle, this seems relatively straightforward.

Nonlinear utility functions: under what conditions can we use nonlinear utility functions and still compute the inferences from Section 2.4?

Multi-user utility inference: how can we extend the inference in this paper to leverage query responses from multiple users in a Bayesian collaborative filtering [10] framework?

References

[1] Fahiem Bacchus and Adam Grove. Graphical models for preference and utility. In Uncertainty in Artificial Intelligence: Proceedings of the Eleventh Conference (1995), pages 3-10, San Francisco, 1995. Morgan Kaufmann Publishers.
[2] R. Iris Bahar, Erica Frohm, Charles Gaona, Gary Hachtel, Enrico Macii, Abelardo Pardo, and Fabio Somenzi. Algebraic decision diagrams and their applications. In IEEE/ACM International Conference on CAD, 1993.
[3] Craig Boutilier. A POMDP formulation of preference elicitation problems. In Eighteenth National Conference on Artificial Intelligence, pages 239-246, Menlo Park, CA, USA, 2002. American Association for Artificial Intelligence.
[4] Urszula Chajewska and Daphne Koller. Utilities as random variables: Density estimation and structure discovery. In UAI, pages 63-71, 2000.
[5] Urszula Chajewska, Daphne Koller, and Dirk Ormoneit. Learning an agent's utility function by observing behavior. In Proc. of the 18th Intl. Conf. on Machine Learning, pages 35-42, 2001.
[6] Wei Chu and Zoubin Ghahramani. Preference learning with Gaussian processes. In ICML, 2005.
[7] Vincent Conitzer. Eliciting single-peaked preferences using comparison queries. Journal of Artificial Intelligence Research, 35, 2009.
[8] S. Guo and S. Sanner. Real-time multiattribute Bayesian preference elicitation with pairwise comparison queries. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS-10), volume 9, pages 289-296, Sardinia, Italy, May 2010.
[9] R. L. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. J. Wiley, New York, 1976.
[10] Paul Resnick and Hal R. Varian. Recommender systems. Communications of the ACM, 40:56-58, March 1997.
[11] Scott Sanner, Karina Valdivia Delgado, and Leliane Nunes de Barros. Symbolic dynamic programming for discrete and continuous state MDPs. In UAI-2011, Barcelona, 2011.
[12] Paolo Viappiani and Craig Boutilier. Optimal Bayesian recommendation sets and myopically optimal choice query sets. In Advances in Neural Information Processing Systems 23 (NIPS). MIT Press, 2010.
