Exact Bayesian Pairwise Preference Learning and Inference on the Uniform Convex Polytope

Scott Sanner
NICTA & ANU
Canberra, ACT, Australia

Ehsan Abbasnejad
ANU & NICTA
Canberra, ACT, Australia

Abstract

In Bayesian approaches to utility learning from preferences, the objective is to infer a posterior belief distribution over an agent's utility function based on previously observed agent preferences. From this, one can then estimate quantities such as the expected utility of a decision or the probability of an unobserved preference, which can then be used to make or suggest future decisions on behalf of the agent. However, there remains an open question as to how one can represent beliefs over agent utilities, perform Bayesian updating based on observed agent pairwise preferences, and make inferences with this posterior distribution in an exact, closed form. In this paper, we build on Bayesian pairwise preference learning models under the assumptions of linearly additive multi-attribute utility functions and a bounded uniform utility prior. These assumptions lead to a posterior form that is a uniform distribution over a convex polytope, for which we then demonstrate how to perform exact, closed-form inference w.r.t. this posterior, i.e., without resorting to sampling or other approximation methods.

1 Introduction

Decision support systems (DSSs) require knowledge of an agent's preferences in order to make or suggest future decisions on their behalf. Clearly then, the most crucial task of a DSS is learning to predict future agent preferences from previously observed agent preferences. In recent years, an increasingly popular approach to preference learning relies on Bayesian updating of posterior beliefs in an underlying (multi-attribute) utility function conditioned on observed agent preferences [4, 5, 3, 6, 8, 12]; inference can then be performed by marginalizing over these posterior utility beliefs w.r.t. future preference queries. This Bayesian preference learning (BPL) is attractive for a multitude of reasons, among them (a) the generally acknowledged robustness of Bayesian learning methods to overfitting, (b) the fact that updates can be done online, and (c) the fact that Bayesian methods provide a useful measure of confidence in their predictions via a probability distribution over predicted outcomes.

Despite the attractiveness of BPL, we are not aware of a Bayesian pairwise preference learning (BPPL) approach that claims to do all learning and inference in an exact, closed form. Arguably the two most common priors for BPPL are either Gaussian or Uniform. In the former case, a Gaussian prior is not conjugate to commonly used pairwise preference likelihoods, leading to the need for approximate Bayesian updating in order to maintain a tractable form for the posterior [4, 3, 6, 8]; in the latter case, certain Uniform distributions are indeed conjugate to commonly used pairwise preference likelihoods, but inference with the posterior convex polytope requires difficult integrals that in practice are approximated via sampling or other methods [5, 12].

In this paper, we argue that the exact integrals in the case of BPPL with commonly used linearly additive multi-attribute utility functions and a uniform utility prior are not as computationally difficult as they may appear. Indeed, armed with symbolic inference for piecewise functions and efficient data structures derived from algebraic decision diagrams (ADDs) for representing and performing operations on these functions, we show that such inference can be computed efficiently for reasonably sized problems in practice. This is the first time (to the best of the authors' knowledge) that an explicit algorithm has been provided for doing BPPL and all associated inference in exact, closed form.

2 Bayesian Pairwise Preference Learning (BPPL)

Here we follow much of the BPPL setup of [8], except for the case of a Uniform rather than Gaussian prior distribution. In multi-attribute utility theory (MAUT) [9], utilities are modeled over a D-dimensional attribute vector x where each attribute choice x_d (1 ≤ d ≤ D) is either binary (x_d ∈ {0 (false), 1 (true)}) or real-valued (x_d ∈ R). An item is described by its attribute choice assignments x = (x_1, ..., x_D). For example, let our item space consist of possible cars we might buy, let binary attribute choice x_1 represent whether the car is a 2-door (if false, we assume it is a 4-door), and let x_2 represent the amount of engine displacement in liters. Perhaps two of our available items are a 2.4L two-door Accord with attribute vector x_a = (1, 2.4) and a 3.5L four-door Buick with attribute vector x_b = (0, 3.5).

Our goal in BPPL will be to learn an attribute weight vector w ∈ R^D that describes the utility of each attribute choice in each attribute dimension. As commonly done in MAUT [9, 1], we assume that the overall utility u(x; w) of item x w.r.t. attribute weight vector w decomposes additively over the attribute choices of x, i.e.,

    u(x; w) = Σ_{d=1}^{D} w_d x_d.    (1)

We let u(·; w) represent the user's true utility w.r.t. their true (but hidden) w. Since w is unknown to the decision support system, we take a Bayesian perspective on learning [4] and thus maintain a probability distribution P(w) representing our beliefs over w. For this, we must define our prior utility beliefs, the likelihood (preference query response model), the form of the posterior for our Bayesian update, and the form of any inferences we wish to make with the posterior.

2.1 Utility Prior

For our prior beliefs P(w), we start with a hyper-rectangular multivariate uniform distribution over w represented as follows:

    P(w) = Π_{d=1}^{D} p(w_d) = Π_{d=1}^{D} U(w_d; −C, C),    (2)

where C ∈ R (C ≥ 0) is some constant¹ and the uniform distribution U is simply defined using an indicator function²:

    U(w_d; L, U) = 1/(U − L) · I[(w_d ≥ L) ∧ (w_d ≤ U)].

By using respective lower and upper bounds of −C and C for each uniform dimension of the prior, we enforce that E_{P(w)}[w] = 0, indicating agnostic prior beliefs about the utility of each attribute. As this will be an important observation for future sections, we remark that such a hyper-rectangular uniform prior as defined here can also be conveniently viewed as the intersection (product) of half-spaces given by the indicator definition of U.

¹ In practice, C could be different for each dimension, if justified.
² We use I[·] as an indicator function taking the value 1 when its argument is true and 0 otherwise.
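To ground the model, the following minimal Python/NumPy sketch (not from the paper; the function names and the candidate weight vector are illustrative) evaluates the additive utility (1) and the box-uniform prior (2) on the two-attribute car example:

```python
import numpy as np

def utility(x, w):
    """Additive multi-attribute utility u(x; w) = sum_d w_d * x_d from (1)."""
    return float(np.dot(w, x))

def prior_density(w, C=1.0):
    """Hyper-rectangular uniform prior (2): a product over dimensions of
    (1 / 2C) * I[-C <= w_d <= C], i.e., an intersection of half-spaces."""
    w = np.asarray(w, dtype=float)
    return float(np.all((w >= -C) & (w <= C))) / (2.0 * C) ** len(w)

# The two car items from the running example: a 2.4L two-door and a 3.5L four-door.
x_a, x_b = np.array([1.0, 2.4]), np.array([0.0, 3.5])
w = np.array([0.8, -0.3])                # one candidate weight vector inside the prior box
print(utility(x_a, w), utility(x_b, w))  # approx. 0.08 and -1.05
print(prior_density(w), prior_density([2.0, 0.0]))  # 0.25 and 0.0 (outside the box for C = 1)
```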

2.2 Query & User Response Model

In this paper, we focus on pairwise comparison queries, known to require low cognitive load for users to answer [7]. We use Q_ab = {a ≻ b, a ≺ b, a ∼ b} to represent a pairwise comparison query indicating the user's preference for item x_a vs. item x_b (henceforth just a and b). Depending on the user's attribute weight vector w and the corresponding item utilities u(a; w) and u(b; w), the user's response q_ab ∈ Q_ab indicates the following:

a ≻ b: the user prefers a to b;
a ≺ b: the user prefers b to a;
a ∼ b: the user is indifferent between a and b.

We represent the user query model with an indicator function over the pairwise utility differences as follows:

    P(Q_ab = a ≻ b | w) = I[ u(a; w) − u(b; w) > ɛ ]
    P(Q_ab = a ≺ b | w) = I[ u(b; w) − u(a; w) > ɛ ]
    P(Q_ab = a ∼ b | w) = I[ |u(a; w) − u(b; w)| ≤ ɛ ],    (3)

where we can modulate the range of utility differences for which the user is indifferent by adjusting ɛ. Note that by definition, Σ_{q_ab} P(q_ab | w) = 1. If needed, we could easily model more noise in the elicitation process simply by allowing non-zero probability for weight vectors that disagree with the query response. For example, here we model a 10% chance that a user's query response disagrees with their utility valuation of items a and b by more than ɛ:

    P(Q_ab = a ≻ b | w) = 0.9 · I[ u(a; w) − u(b; w) > ɛ ] + 0.1 · I[ u(a; w) − u(b; w) ≤ ɛ ].    (4)

Whatever exact form of the above query response model is used, we note that due to the linearity of u(·; w) in w, all of the above query response likelihood models can be represented by a (sum of) linearly separated half-spaces owing to the linear constraints in the likelihood indicator functions.

2.3 Bayesian Utility Updating

Given a prior utility belief P(w | R_n) w.r.t. a (possibly empty) set of n ≥ 0 query responses R_n = {q_kl} and a new query response q_ab, we perform the following Bayesian update to obtain a posterior belief P(w | R_{n+1}), where R_{n+1} = R_n ∪ {q_ab}:

    P(w | R_{n+1}) ∝ P(q_ab | w, R_n) P(w | R_n) = P(q_ab | w) P(w | R_n).    (5)

Assuming that our query likelihood P(q_ab | w) is modeled as described in (3) [or (4)], we note that since the prior is a product of linear half-spaces and the likelihood is also [a sum of] linear half-spaces and these are multiplied, the resulting posterior is a uniform convex polytope³ [or, after placing the result in sum-of-products form, a sum of uniform convex polytopes]. Hence, an unnormalized Bayesian update is computationally straightforward and in fact has a very efficient O(nD) complexity for the case of the likelihood model in (3). In Figure 1, we show a D = 2 representation of a posterior for a likelihood in the form of (3).

2.4 Preference Inference

Now that we have posterior beliefs in our utility weighting P(w | R_n), we would like to perform at least two types of inference:

Item with maximum expected utility:

    arg max_x E_w[ u(x; w) | R_n ] = arg max_x ∫ P(w | R_n) [ Σ_{d=1}^{D} w_d x_d ] dw

Probability of preference outcomes:

    P(q_ab | R_n) = ∫ P(q_ab | w) P(w | R_n) dw

³ Here an unnormalized uniform distribution on the convex polytope defined by the product of linear constraints in the indicator functions.
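Both the update (5) and the two inference queries above can be checked numerically with a short Monte Carlo sketch. This is only a sanity check under assumed item vectors and ɛ, not the exact symbolic method developed in this paper (which avoids sampling entirely):

```python
import numpy as np

rng = np.random.default_rng(0)
C, eps = 1.0, 0.05
u = lambda x, w: float(np.dot(w, x))              # additive utility (1)

def likelihood(q, x_a, x_b, w):
    """Noise-free response model (3): an indicator over the utility difference."""
    d = u(x_a, w) - u(x_b, w)
    return {'a>b': d > eps, 'b>a': -d > eps, 'a~b': abs(d) <= eps}[q]

def in_posterior_polytope(w, responses):
    """Unnormalized posterior (5): prior box indicator times response indicators."""
    return bool(np.all(np.abs(w) <= C)) and all(likelihood(q, xa, xb, w) for q, xa, xb in responses)

# Observed response set R_n: the user preferred item a = (1, 2.4) to item b = (0, 3.5).
R_n = [('a>b', np.array([1.0, 2.4]), np.array([0.0, 3.5]))]
W = rng.uniform(-C, C, size=(100_000, 2))               # samples from the uniform prior
post = W[[in_posterior_polytope(w, R_n) for w in W]]    # samples inside the posterior polytope

x = np.array([1.0, 3.0])                            # an item whose expected utility we want
print("E[u(x; w) | R_n] ~", np.mean(post @ x))      # first query type
x_c, x_d = np.array([1.0, 1.0]), np.array([0.0, 2.0])
print("P(c > d | R_n)  ~", np.mean([likelihood('a>b', x_c, x_d, w) for w in post]))  # second query type
```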

Figure 1: A representation of the uniform convex polytope representing the posterior distribution on w when using the likelihood model in (3). The union of the striped and cross-hatched areas shows the non-zero posterior polytope before updating with query response q_ab, and the cross-hatched area shows the non-zero posterior polytope after updating with query response q_ab. We note that query response q_cd would not lead to any change in either posterior (before or after updating with q_ab), and its constraint could be pruned from the constraint set.

If using likelihood form (3), we note that both inferences are volume integrals over a function (u(x; w) in the first case and a constant in the latter case) subject to boundary constraints over a convex polytope defined w.r.t. P(w | R_n) (appearing in both inferences) and P(q_ab | w) (only for the second inference). In the case of likelihood form (4), we simply get a sum of such integrals.

Consequently, the question we have to ask in order to perform exact inference in this Uniform BPPL setting is the following: given a function defined over the region of a convex polytope like the one in Figure 1, how do we compute this integral in closed form? At first, this would seem quite difficult (the boundaries in any dimension in Figure 1 may be a piecewise function of the other dimensions), but on closer inspection, there is a systematic way to compute these integrals if given the right representation. We discuss this next.

3 Symbolic Variable Elimination in Discrete/Continuous Graphical Models

In this section, we introduce a case notation and operations for piecewise polynomial functions and demonstrate that all Uniform BPPL operations, including the required integrals, can be computed exactly and in closed form using a purely symbolic representation.

3.1 Case Representation and Operators

For this section, we will assume all functions are represented in case form [11] as follows:

    f = { φ_1 : f_1 ;  ...  ; φ_k : f_k }    (6)

Here, the f_i may be polynomial functions of w (although initially the functions f_i will be constant or linear, we will see that integration will introduce higher-order polynomials for the f_i). The φ_i are logical formulae defined over w that can consist of arbitrary logical combinations of inequalities (≥, >, ≤, <) over linear functions of w. We assume that the set of conditions {φ_1, ..., φ_k} disjointly and exhaustively partition the space of w such that f is well-defined for all w ∈ R^D. It is easy to verify that such a representation can represent all of the functions discussed in Section 2.3.

Unary operations such as scalar multiplication c · f (for some constant c ∈ R) or negation −f on a case statement f are straightforward; the unary operation is simply applied to each f_i (1 ≤ i ≤ k).
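A minimal Python sketch of the case representation (6) (the class and method names here are invented for illustration) stores a list of (φ_i, f_i) partitions and applies a unary operation leaf-wise, exactly as described above:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One partition: a condition phi_i(w) and a value function f_i(w).  The phi_i are
# assumed to disjointly and exhaustively partition the space of w.
Partition = Tuple[Callable[[List[float]], bool], Callable[[List[float]], float]]

@dataclass
class Case:
    partitions: List[Partition]

    def __call__(self, w: List[float]) -> float:
        for phi, f in self.partitions:
            if phi(w):
                return f(w)
        raise ValueError("conditions must exhaustively cover w")

    def scale(self, c: float) -> "Case":
        """Unary operation c * f: apply the scalar to every leaf f_i."""
        return Case([(phi, (lambda fi: lambda w: c * fi(w))(f)) for phi, f in self.partitions])

# Example: f = { w_1 > 0 : w_1 ; w_1 <= 0 : 0 }, then the unary operation 3 * f.
f = Case([(lambda w: w[0] > 0, lambda w: w[0]),
          (lambda w: w[0] <= 0, lambda w: 0.0)])
print(f([2.0]), f.scale(3.0)([2.0]))   # 2.0 6.0
```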

Intuitively, to perform a binary operation on two case statements, we simply take the cross-product of the logical partitions of each case statement and perform the corresponding operation on the resulting paired partitions. Thus, we perform the cross-sum ⊕ of two (unnamed) cases as follows:

    { φ_1 : f_1 ; φ_2 : f_2 } ⊕ { ψ_1 : g_1 ; ψ_2 : g_2 } =
    { φ_1 ∧ ψ_1 : f_1 + g_1 ;
      φ_1 ∧ ψ_2 : f_1 + g_2 ;
      φ_2 ∧ ψ_1 : f_2 + g_1 ;
      φ_2 ∧ ψ_2 : f_2 + g_2 }

Likewise, we perform ⊖ and ⊗ by, respectively, subtracting or multiplying partition values (rather than adding them) to obtain the result. Some partitions resulting from the application of the ⊕, ⊖, and ⊗ operators may be inconsistent (infeasible); if we can detect this (e.g., via a linear constraint solver), we may simply discard such partitions as they are irrelevant to the function value.

For the Uniform BPPL inference defined in Section 2.4, we'll need to compute definite integrals, a fairly non-trivial operation that is discussed in its own section. But first we discuss maximization, needed for working with integral bounds. Symbolic maximization is fairly straightforward to define:

    max( { φ_1 : f_1 ; φ_2 : f_2 } , { ψ_1 : g_1 ; ψ_2 : g_2 } ) =
    { φ_1 ∧ ψ_1 ∧ (f_1 > g_1) : f_1 ;
      φ_1 ∧ ψ_1 ∧ (f_1 ≤ g_1) : g_1 ;
      φ_1 ∧ ψ_2 ∧ (f_1 > g_2) : f_1 ;
      φ_1 ∧ ψ_2 ∧ (f_1 ≤ g_2) : g_2 ;
      φ_2 ∧ ψ_1 ∧ (f_2 > g_1) : f_2 ;
      φ_2 ∧ ψ_1 ∧ (f_2 ≤ g_1) : g_1 ;
      φ_2 ∧ ψ_2 ∧ (f_2 > g_2) : f_2 ;
      φ_2 ∧ ψ_2 ∧ (f_2 ≤ g_2) : g_2 }

The key observation here is that case statements are closed under the max operation. While it may appear that this representation will lead to an unreasonable blowup in size, we note that the XADD we introduce later in Section 4 will be able to exploit the internal decision structure of this maximization to represent it much more compactly.

3.2 Integration on the Convex Polytope

We now have almost all of the operations needed to perform Uniform BPPL inference as defined in Section 2.4, except for a method to compute each of the integrals ∫_{w_1}, ..., ∫_{w_D} w.r.t. the convex polytope constraints such as those shown in Figure 1. We provide this integration technique here.

If we are computing ∫_{w_1} f dw_1 for f in (6), we can rewrite it in the following equivalent form:

    ∫_{w_1} [ ⊕_i I[φ_i] · f_i ] dw_1 = ⊕_i ∫_{w_1} I[φ_i] · f_i dw_1.    (7)

Hence we can compute the integrals separately for each case partition (producing a case statement) and then combine the results using ⊕. To continue with the integral for a single case partition, we introduce a concrete example. Let f_1 := w_2^3 · w_1 and φ_1 := [w_1 > w_2 − 1] ∧ [w_1 > −w_2 − 1] ∧ [w_1 ≤ w_2] ∧ [w_1 ≤ w_2 + 1] ∧ [w_2 > 0]. In computing ∫_{w_1} I[φ_1] · f_1 dw_1, the first thing we note is that the constraints involving w_1 in I[φ_1] can be used to restrict the integration range for w_1. In fact, we know that the integrand is only non-zero for max(w_2 − 1, −w_2 − 1) ≤ w_1 ≤ min(w_2, w_2 + 1). Using the max (and corresponding min) operations defined previously, let us write explicit functions for these bounds in piecewise linear case form for the respective lower and upper bounds LB and UB:

    LB := { w_2 − 1 > −w_2 − 1 : w_2 − 1 ;  w_2 − 1 ≤ −w_2 − 1 : −w_2 − 1 }
    UB := { w_2 < w_2 + 1 : w_2 ;  w_2 ≥ w_2 + 1 : w_2 + 1 }
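The binary apply and the symbolic casemax above (the same machinery that produces piecewise bounds such as LB and UB) can be sketched in a few lines of sympy. This is an illustrative simplification, not the paper's implementation, and the example cases are made up:

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')

# Case statements as lists of (condition, value) partitions.
f = [(w1 > 0, w1), (w1 <= 0, sp.Integer(0))]
g = [(w2 > 1, 2 * w2), (w2 <= 1, w2)]

def apply_op(f, g, op):
    """Binary apply: cross the partitions, conjoin the conditions, and combine the
    leaves with `op`.  Infeasible conjunctions could be pruned with an LP solver."""
    return [(sp.And(phi, psi), op(fi, gi)) for phi, fi in f for psi, gi in g]

def case_max(f, g):
    """Symbolic max: each crossed partition additionally splits on f_i > g_i, so the
    result remains a case statement with polynomial leaves."""
    out = []
    for phi, fi in f:
        for psi, gi in g:
            out.append((sp.And(phi, psi, fi > gi), fi))
            out.append((sp.And(phi, psi, fi <= gi), gi))
    return out

for cond, val in apply_op(f, g, lambda a, b: a + b):   # the cross-sum
    print(cond, '->', val)
print(len(case_max(f, g)), 'partitions in max(f, g)')  # 8, matching the display above
```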

Figure 2: Left: a tree representation of a case statement; each branch corresponds to a single case partition. Right: the minimal XADD representation of the same case statement; note the compact DAG structure.

Now we can rewrite the integral as⁴

    I[w_2 > 0] ⊗ ∫_{w_1=LB}^{UB} ( w_2^3 · w_1 ) dw_1.    (8)

Note here that I[w_2 > 0] is independent of w_1 and hence can factor outside the integral. With all indicator functions moved into the LB or UB or factored out, we can now compute the integral:

    I[w_2 > 0] ⊗ [ (w_2^3 · w_1^2) / 2 ]_{w_1=LB}^{w_1=UB}.    (9)

The question now is simply how to perform this evaluation. Here we note that every expression (variables, constants, indicator functions, etc.) can be written as a simple case statement or as operations on case statements, even the upper and lower bounds as shown previously. So the evaluation is simply:

    I[w_2 > 0] ⊗ ( [ w_2^3 / 2 ⊗ UB ⊗ UB ] ⊖ [ w_2^3 / 2 ⊗ LB ⊗ LB ] ).    (10)

Hence the result of the definite integration over a case partition of a polynomial function with piecewise linear constraints is a case statement with piecewise linear constraints over polynomial functions; this is somewhat remarkable given that all of the bound computations were symbolic. However, we are not yet done; there is one final step that we must include for correctness. Because our bounds are symbolic, it may be the case for some assignment to w_2 that LB ≥ UB (this occurs in infeasible portions of the convex polytope). In this case the integral should be zero since the constraints on w_1 could not be jointly satisfied. To enforce this symbolically, we simply need to ⊗ (10) by case statements representing the following inequalities for all pairs of upper and lower bounds:

    I[w_2 > w_2 − 1] ⊗ I[w_2 > −w_2 − 1] ⊗ I[w_2 + 1 > w_2 − 1] ⊗ I[w_2 + 1 > −w_2 − 1].    (11)

Of course, here I[w_2 > w_2 − 1] and I[w_2 + 1 > w_2 − 1] could be removed as tautologies, but not the other constraints. This provides the solution for a single case partition and, from (7), we just need to ⊕ the case statements from each case partition's integral evaluation to obtain the full integral, still in case form. Repeating this for all integrals ∫_{w_1}, ..., ∫_{w_D}, we can obtain closed-form exact results for all queries in Section 2.4!

⁴ The careful reader will note that because the lower bounds were defined in terms of > rather than ≥, we technically have an improper integral and need to take a limit. However, in taking the limit, we note that all integrands are continuous polynomials of order 0 or greater, so the limit exists and yields the same answer as substituting the limit value. Hence for polynomials, we need not be concerned about whether bounds are inclusive or not.

4 Extended ADDs (XADDs) for Case Statements

In practice, it can be prohibitively expensive to maintain a case statement representation of a value function with explicit partitions. Motivated by algebraic decision diagrams (ADDs) [2], which maintain compact representations for finite discrete functions, we use an extended ADD (XADD) formalism introduced in [11] and demonstrated in Figure 2 (right).
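To illustrate why the DAG form matters, here is a toy hash-consing sketch; it is not the actual XADD implementation of [11], and the class and method names are invented. Structurally identical decision nodes and leaves are interned once and shared, which is what turns the expanded tree of Figure 2 (left) into the compact DAG of Figure 2 (right):

```python
import sympy as sp

class XADDPool:
    """Toy hash-consed decision-diagram pool: structurally identical nodes are shared."""
    def __init__(self):
        self._nodes = {}                 # structural key -> node id

    def leaf(self, expr):
        return self._intern(('leaf', sp.sympify(expr)))

    def decision(self, test, hi, lo):
        # `test` is a linear inequality (a sympy Relational); hi/lo are node ids.
        if hi == lo:                     # redundant test: both branches identical
            return hi
        return self._intern(('dec', sp.sympify(test), hi, lo))

    def _intern(self, key):
        if key not in self._nodes:
            self._nodes[key] = len(self._nodes)
        return self._nodes[key]

    def size(self):
        return len(self._nodes)

w1, w2 = sp.symbols('w1 w2')
pool = XADDPool()
zero, lin = pool.leaf(0), pool.leaf(w1 + w2)
inner = pool.decision(w2 > 0, lin, zero)        # shared sub-diagram
root_a = pool.decision(w1 > 1, inner, zero)
root_b = pool.decision(w1 > -1, inner, zero)
print(pool.size())   # 5 shared nodes, versus duplicating `inner` in a tree representation
```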

Figure 3: The performance of the expected utility computation of (12) in terms of time (in ms) with varying dimensionality (utility attributes) and number of constraints (query responses).

In brief, we note that an XADD is like an algebraic decision diagram (ADD) [2] except that (a) the decision nodes can have arbitrary inequalities, equalities, or disequalities (one per node) and (b) the leaf nodes can represent arbitrary functions. The decision nodes still have a fixed order from root to leaf, and the standard ADD operations to build a canonical ADD (REDUCE) and to perform a binary operation on two ADDs (APPLY) still apply in the case of XADDs. Of particular importance is that the XADD is a directed acyclic graph (DAG), and hence is often much more compact than a tree representation. For example, in Figure 2 (left) we provide the function represented by the XADD in Figure 2 (right) expanded from a DAG into its full tree form. We note that not only is the XADD much more compact on account of its DAG, but that all unary and binary operations on XADDs can directly exploit this compact structure for efficiency. For more compactness and hence efficiency, we can apply a linear constraint solver to prune out unreachable branches of the convex polytope so that, for example, dominated polytope constraints for preference q_cd in Figure 1 can simply be pruned from the XADD. To this end, we can use the XADD to perform all case operations required for inference in Section 2.3 and hence efficiently perform the Uniform BPPL queries from Section 2.4, as we demonstrate next.

5 Empirical Results

In this section we demonstrate that performing the inference from Section 2.4 is efficient in most cases for a reasonable number of attribute dimensions and pairwise query response observations. In Figure 3, we illustrate how the algorithm performs as the number of attributes D (i.e., dimensionality) and the size n of the query response set R_n (i.e., number of constraints) vary. To run this experiment, we randomly generated an attribute weighting vector w of dimensionality D as well as n query responses R_n (ensuring all responses maintained a feasible utility region), performed Bayesian updating under the likelihood model of (3), and then carried out the expected utility subquery from Section 2.4:

    E_w[ u(x; w) | R_n ] = ∫ P(w | R_n) [ Σ_{d=1}^{D} w_d x_d ] dw.    (12)

We recorded the total time of this inference as a function of the dimensionality D and the number of constraints n. As we can observe in Figure 3, in general the expected utility inference runs within a reasonable time for most cases (the flat surface is not zero, but rather on the order of 100 ms or less). Curiously, it is only for a certain combination of constraints and dimensionality that the algorithm fails to run fast enough for real-time performance; empirically, it seems that over-constrained and under-constrained problems for lower dimensions are easiest for inference. It remains to explain exactly why this happens; intuitively, one would have thought that the smaller the number of constraints, the more easily the piecewise integrals could be evaluated. We are currently investigating this phenomenon in more detail.
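For a concrete 2-D instance of the expected utility query (12), the same integral can be carried out exactly with sympy by splitting the piecewise integration bound by hand, which mirrors (in miniature) the case-based handling of bounds from Section 3.2. This sketch is not the XADD-based algorithm timed in Figure 3; the item vectors, ɛ, and prior bound are illustrative:

```python
import sympy as sp

w1, w2 = sp.symbols('w1 w2')
# Posterior polytope for the car example: prior box [-1, 1]^2 intersected with the
# half-space u(a; w) - u(b; w) > eps, i.e., w1 - 1.1*w2 > 0.05.
lb = sp.Rational(1, 20) + sp.Rational(11, 10) * w2    # linear lower bound on w1
s1 = sp.solve(sp.Eq(lb, -1), w2)[0]                   # below this w2, the box bound -1 is active
s2 = sp.solve(sp.Eq(lb, 1), w2)[0]                    # above this w2, the polytope is empty

def over_polytope(h):
    """Exact integral of h(w) over the polytope, splitting the w2 range at s1 and s2."""
    part_a = sp.integrate(sp.integrate(h, (w1, -1, 1)), (w2, -1, s1))
    part_b = sp.integrate(sp.integrate(h, (w1, lb, 1)), (w2, s1, s2))
    return part_a + part_b

volume = over_polytope(sp.Integer(1))                 # normalizer of the uniform posterior
x = (1, 1)                                            # item whose expected utility we want
expected_u = over_polytope(x[0] * w1 + x[1] * w2) / volume
print(expected_u, '~', float(expected_u))             # exact rational value and a float approximation
```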

6 Conclusions and Future Work

This work presented novel and efficient, exact Bayesian learning and inference methods for preference elicitation with the Uniform distribution on the convex polytope. In future work, we aim to investigate the following possible extensions:

Generalized additive independence [9, 1]: can we introduce additive independence over joint utility factors rather than individual utility attributes while still achieving efficient computation? In principle, this seems relatively straightforward.

Nonlinear utility functions: under what conditions can we use nonlinear utility functions and still compute the inferences from Section 2.4?

Multi-user utility inference: how can we extend the inference in this paper to leverage query responses from multiple users in a Bayesian collaborative filtering [10] framework?

References

[1] Fahiem Bacchus and Adam Grove. Graphical models for preference and utility. In Uncertainty in Artificial Intelligence: Proceedings of the Eleventh Conference (1995), pages 3-10, San Francisco, 1995. Morgan Kaufmann Publishers.
[2] R. Iris Bahar, Erica Frohm, Charles Gaona, Gary Hachtel, Enrico Macii, Abelardo Pardo, and Fabio Somenzi. Algebraic decision diagrams and their applications. In IEEE/ACM International Conference on CAD, 1993.
[3] Craig Boutilier. A POMDP formulation of preference elicitation problems. In Eighteenth National Conference on Artificial Intelligence, pages 239-246, Menlo Park, CA, USA, 2002. American Association for Artificial Intelligence.
[4] Urszula Chajewska and Daphne Koller. Utilities as random variables: Density estimation and structure discovery. In UAI, pages 63-71, 2000.
[5] Urszula Chajewska, Daphne Koller, and Dirk Ormoneit. Learning an agent's utility function by observing behavior. In Proc. of the 18th Intl. Conf. on Machine Learning, pages 35-42, 2001.
[6] Wei Chu and Zoubin Ghahramani. Preference learning with Gaussian processes. In ICML, 2005.
[7] Vincent Conitzer. Eliciting single-peaked preferences using comparison queries. Journal of Artificial Intelligence Research, 35, 2009.
[8] S. Guo and S. Sanner. Real-time multiattribute Bayesian preference elicitation with pairwise comparison queries. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS-10), volume 9, pages 289-296, Sardinia, Italy, May 2010.
[9] R. L. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. J. Wiley, New York, 1976.
[10] Paul Resnick and Hal R. Varian. Recommender systems. Communications of the ACM, 40:56-58, March 1997.
[11] Scott Sanner, Karina Valdivia Delgado, and Leliane Nunes de Barros. Symbolic dynamic programming for discrete and continuous state MDPs. In UAI-2011, Barcelona, 2011.
[12] Paolo Viappiani and Craig Boutilier. Optimal Bayesian recommendation sets and myopically optimal choice query sets. In Advances in Neural Information Processing Systems 23 (NIPS). MIT Press, 2010.
