Data Structures for Efficient Inference and Optimization

1 Data Structures for Efficient Inference and Optimization in Expressive Continuous Domains. Scott Sanner, Ehsan Abbasnejad, Zahra Zamani, Karina Valdivia Delgado, Leliane Nunes de Barros, Cheng Fang

2 Discrete & Continuous HMMs No one previously did this inference exactly in closed-form!

3 Exact Closed-form Continuous Inference. Is exact (conditional) inference possible in closed-form?
   Fully Gaussian: yes, most inference including conditionals
   Fully Uniform: yes, for 1-D and n-D hyperrectangular cases
   General Uniform: yes, but not a solution you can write on one sheet of paper
   Piecewise, asymmetrical, multimodal: ?

4 What has everyone been missing? Symbolic representations and operations on piecewise functions

5 Piecewise Functions (Cases)
   z = f(x, y) = { (x > 3) ∧ (y ≤ x) : x + y
                 { (x ≤ 3) ∨ (y > x) : x² + xy − 3
Each partition pairs a constraint with a value; cases may have linear constraints and values, linear constraints with constant values, or quadratic constraints and values.

6 Formal Problem Statement
General continuous graphical models are represented by piecewise functions (cases):
   f = { φ₁ : f₁
       { ...
       { φₖ : fₖ
The exact closed-form solution is inferred via the following piecewise calculus:
   f₁ ⊕ f₂,  f₁ ⊗ f₂
   max(f₁, f₂),  min(f₁, f₂)
   ∫ₓ f(x) dx
   maxₓ f(x),  minₓ f(x)
Question: how do we perform these operations in closed-form?

7 Polynomial Case Operations: ⊕, ⊗
   { φ₁ : f₁   ⊕   { ψ₁ : g₁   =  ?
   { φ₂ : f₂       { ψ₂ : g₂

8 Polynomial Case Operations: ⊕, ⊗
   { φ₁ : f₁   ⊕   { ψ₁ : g₁   =   { φ₁ ∧ ψ₁ : f₁ + g₁
   { φ₂ : f₂       { ψ₂ : g₂       { φ₁ ∧ ψ₂ : f₁ + g₂
                                    { φ₂ ∧ ψ₁ : f₂ + g₁
                                    { φ₂ ∧ ψ₂ : f₂ + g₂
Similarly for ⊗. Polynomials are closed under + and *. What about max? The max of polynomials is not a polynomial.
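Below is a minimal sketch of this cross-partition combination, assuming a case is simply a list of (constraint, value) pairs of sympy expressions; the representation and helper names are illustrative, not the authors' implementation.

```python
# Sketch of case addition (and, with a different op, multiplication): cross
# the partitions of the two cases and conjoin their constraints.
import sympy as sp

x, y = sp.symbols('x y')

def case_binary_op(case_f, case_g, op):
    """Cross-product of partitions; op combines the partition values."""
    result = []
    for (phi, f) in case_f:
        for (psi, g) in case_g:
            result.append((sp.And(phi, psi), sp.expand(op(f, g))))
    return result

f = [(x > 3, x + y), (x <= 3, x**2)]           # { x>3 : x+y ; x<=3 : x^2 }
g = [(y > 0, 2*y),   (y <= 0, sp.Integer(1))]  # { y>0 : 2y  ; y<=0 : 1   }

f_plus_g = case_binary_op(f, g, lambda a, b: a + b)   # the "plus" operation
for phi, val in f_plus_g:
    print(phi, ':', val)
```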

9 Polynomial Case Operations: max
   casemax( { φ₁ : f₁   ,   { ψ₁ : g₁  )  =  ?
            { φ₂ : f₂       { ψ₂ : g₂

10 Polynomial Case Operations: max
   casemax( { φ₁ : f₁   ,   { ψ₁ : g₁  )  =  { φ₁ ∧ ψ₁ ∧ f₁ > g₁ : f₁
            { φ₂ : f₂       { ψ₂ : g₂        { φ₁ ∧ ψ₁ ∧ f₁ ≤ g₁ : g₁
                                              { φ₁ ∧ ψ₂ ∧ f₁ > g₂ : f₁
                                              { φ₁ ∧ ψ₂ ∧ f₁ ≤ g₂ : g₂
                                              { φ₂ ∧ ψ₁ ∧ f₂ > g₁ : f₂
                                              { φ₂ ∧ ψ₁ ∧ f₂ ≤ g₁ : g₁
                                              { φ₂ ∧ ψ₂ ∧ f₂ > g₂ : f₂
                                              { φ₂ ∧ ψ₂ ∧ f₂ ≤ g₂ : g₂
Still a piecewise polynomial! Size blowup? We'll get to that.
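A matching sketch of casemax under the same illustrative list-of-pairs representation as above: each crossed partition splits on the comparison of its two values, so the result stays piecewise polynomial.

```python
# Sketch of casemax: every pair of crossed partitions splits on whether
# f > g, so each pair contributes two partitions to the result.
import sympy as sp

x, y = sp.symbols('x y')

def case_max(case_f, case_g):
    result = []
    for (phi, f) in case_f:
        for (psi, g) in case_g:
            result.append((sp.And(phi, psi, f > g),  f))
            result.append((sp.And(phi, psi, f <= g), g))
    return result

f = [(x > 3, x + y), (x <= 3, x**2)]
g = [(y > 0, 2*y),   (y <= 0, sp.Integer(1))]
for phi, val in case_max(f, g):
    print(phi, ':', val)
```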

11 Integration: ∫ₓ is closed for polynomials, but how do we compute it for a case?
   ∫ₓ { φ₁ : f₁ ; ... ; φₖ : fₖ } dx  =  Σᵢ ∫ₓ [φᵢ] fᵢ dx   (sum over the k partitions i = 1, ..., k)
Just integrate the case partitions and ⊕ the results!

12 Partition Integral
1. Determine the integration bounds for ∫ₓ [φ₁] f₁ dx, where
   φ₁ := [x > 1] ∧ [x > y − 1] ∧ [x ≤ z] ∧ [x ≤ y + 1] ∧ [y > 0]
   f₁ := x² − xy
   LB := { y − 1 > 1 : y − 1        UB := { z < y + 1 : z
         { y − 1 ≤ 1 : 1                  { z ≥ y + 1 : y + 1
What constraints remain? Those independent of x, plus the pairwise UB > LB constraints, collected in [φ_cons]:
   ∫ₓ [φ₁] f₁ dx = [φ_cons] ⊗ ∫_{x=LB}^{UB} f₁ dx
UB and LB are symbolic! How do we evaluate the definite integral?

13 Definite Integral Evaluation
How to evaluate at symbolic integral bounds?
   ∫_{x=LB}^{UB} (x² − xy) dx  =  [ (1/3)x³ − (1/2)x²y ]_{x=LB}^{x=UB}
with
   LB := { y − 1 > 1 : y − 1        UB := { z < y + 1 : z
         { y − 1 ≤ 1 : 1                  { z ≥ y + 1 : y + 1
We can do polynomial operations on cases!
   = (1/3) UB ⊗ UB ⊗ UB  ⊖  (1/2) UB ⊗ UB ⊗ (y)  ⊖  [ (1/3) LB ⊗ LB ⊗ LB  ⊖  (1/2) LB ⊗ LB ⊗ (y) ]
Symbolically, exactly evaluated!
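A small sympy sketch of exactly this evaluation, reusing the illustrative list-of-pairs case representation from the earlier sketches: integrate the partition value as a polynomial, then substitute the piecewise bounds via case operations.

```python
# Sketch of slide 13: antiderivative of the partition value, then case-wise
# substitution of the symbolic upper and lower bounds, combined by case-wise
# subtraction (case_binary_op from the earlier sketch).
import sympy as sp

x, y, z = sp.symbols('x y z')

f1 = x**2 - x*y
F  = sp.integrate(f1, x)             # x**3/3 - x**2*y/2

LB = [(y - 1 > 1, y - 1), (y - 1 <= 1, sp.Integer(1))]
UB = [(z < y + 1, z),     (z >= y + 1, y + 1)]

def case_subst(case_bound, poly, var):
    """Substitute each bound partition's value into poly for var."""
    return [(phi, sp.expand(poly.subs(var, b))) for (phi, b) in case_bound]

def case_binary_op(case_f, case_g, op):
    return [(sp.And(phi, psi), sp.expand(op(f, g)))
            for (phi, f) in case_f for (psi, g) in case_g]

definite = case_binary_op(case_subst(UB, F, x), case_subst(LB, F, x),
                          lambda a, b: a - b)
for phi, val in definite:
    print(phi, ':', val)
```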

14 Exact Graphical Model Inference! (directed and undirected)
We can do general probabilistic inference:
   p(x₂ | x₁) = [ ∫_{x₃} ... ∫_{xₙ} ⊗_{i=1..k} caseᵢ dxₙ ... dx₃ ] / [ ∫_{x₂} ... ∫_{xₙ} ⊗_{i=1..k} caseᵢ dxₙ ... dx₂ ]
Or an exact expectation of any polynomial:
   E_{x ∼ p(x|o)}[poly(x) | o] = ∫ p(x|o) poly(x) dx
where poly can be chosen to give the mean, variance, skew, kurtosis, ..., or e.g. x² + y² + xy.
All computed by Symbolic Variable Elimination (SVE).
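A toy sketch of the simplest instance of this: an exact polynomial expectation under a piecewise density. Here the density is just a uniform distribution written as a sympy Piecewise; in SVE the density p(x|o) would itself be the piecewise-polynomial result of symbolic variable elimination.

```python
# Exact expectation of a polynomial under a piecewise density,
# E[poly(x)] = integral of p(x) * poly(x) dx, computed symbolically.
import sympy as sp

x = sp.symbols('x')
p = sp.Piecewise((sp.Rational(1, 2), (x >= 0) & (x <= 2)), (0, True))  # U[0,2]
poly = x**2 + 3*x                      # any polynomial query

expectation = sp.integrate(sp.piecewise_fold(p * poly), (x, -sp.oo, sp.oo))
print(expectation)                     # exact rational result: 13/3
```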

15 Voila: Closed-form Exact Inference via SVE! In theory

16 An Expressive Conjugate Prior for Bayesian Inference
General Bayesian inference:
   p(θ | d_{n+1}) ∝ p(d_{n+1} | θ) p(θ | d_n)
where the posterior, likelihood, and prior are each cases.
Should we choose the prior and likelihood for computational convenience? No, choose them as appropriate for your problem!

17 Bayesian Robotics
Graphical model: D → x₁, ..., xₙ
   D: true distance to wall
   x₁, ..., xₙ: measurements
   Want: E[D | x₁, ..., xₙ]

18 Bayesian Robotics: Results
Example posterior given measurements {3, 5, 8}: do we really want to make a Gaussian approximation? What if there is a stairwell here?

19 Symbolic Sequential Decision Optimization?

20 Continuous State MDPs: 2-D Navigation (Feng et al., UAI-04)
State: (x, y) ∈ R²
Actions:
   move-x-2: x' = x + 2, y' = y
   move-y-2: x' = x, y' = y + 2
Reward: R(x, y) = I[ (x > 5) ∧ (x < 10) ∧ (y > 2) ∧ (y < 5) ]   (1 in the goal rectangle, 0 elsewhere)
Assumptions:
   1. Continuous transitions are deterministic and linear
   2. Discrete transitions can be stochastic
   3. Reward is piecewise rectilinear convex

21 Exact Solutions to DC-MDPs: Regression (Feng et al., UAI-04)
Continuous regression is just translation of pieces: regressing V(x', y') through move-x-2 and move-y-2 yields Q(move-x-2, x, y) and Q(move-y-2, x, y), whose rectangular pieces are the pieces of V shifted along x and y respectively. A sketch of this substitution appears below.
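A minimal sketch of that point using sympy's Piecewise as a stand-in for the symbolic value function: regressing through the deterministic linear transition x' = x + 2 is just the substitution x → x + 2, which translates the piece boundaries (discounting and reward addition are omitted here for brevity).

```python
# Regression through a deterministic linear action is translation of pieces:
# substituting x' = x + 2 into V(x', y') shifts the goal rectangle along x.
import sympy as sp

x, y = sp.symbols('x y')

goal = (x > 5) & (x < 10) & (y > 2) & (y < 5)     # slide 20's goal region
V0 = sp.Piecewise((1, goal), (0, True))           # reward / initial value

Q_move_x = V0.subs(x, x + 2)                      # regress through move-x-2
print(Q_move_x)   # condition becomes x + 2 > 5 etc., i.e. active for 3 < x < 8
```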

22 Exact Solutions to DC-MDPs: Maximization (Feng et al., UAI-04)
Q-value maximization yields a piecewise rectilinear solution: V(x, y) = max_a Q(a, x, y), taken over Q(move-x-2, x, y) and Q(move-y-2, x, y).

23 Previous Work Limitations I
Exact regression when transitions are nonlinear?
   Action move-nonlin: x' = x³y + y², y' = y·log(x²y)
How do we compute Q(move-nonlin, x, y)? How do we compute the piece boundaries in closed-form?

24 Previous Work Limitations II
What about max(·, ·) when the reward/value is arbitrary piecewise?
   V(x, y) = max( Q(action-1, x, y), Q(action-2, x, y) )
Is there a closed-form representation for this max?

25 Continuous Actions? If we can solve this, we can solve multivariate inventory control, whose closed-form policy has been unknown for 50+ years!

26 Continuous Actions
Inventory control: reorder based on current stock and future demand.
   Action: a(y), with a continuous parameter vector y ∈ R^|a|
Need a max over the continuous parameters in the Bellman backup:
   V^{h+1} = max_{a ∈ A} max_y Q_a^{h+1}(y)
   max_y case(y) is computed similarly to ∫_y case(y)
Track the maximizing substitutions to recover the policy.

27 Max-out Case Operation
Like ∫ₓ case(x), reduce to a max within a single partition.
Within a single case partition, max with respect to x by comparing the critical points: the lower bound LB, the upper bound UB, and the points where the derivative is zero (Der0):
   max( case(x/LB), case(x/UB), case(x/Der0) )
See the UAI 2011 and AAAI 2012 papers for more details. We can even track substitutions through the max to recover the function of maximizing assignments!
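A small sympy sketch of the single-partition step, with concrete bounds substituted for the (in general piecewise) LB and UB to keep it short; the names and the example function are illustrative only.

```python
# Max-out of x within one case partition: the maximum must occur at the
# partition's lower bound, upper bound, or a root of the derivative (Der0).
import sympy as sp

x, y = sp.symbols('x y')

f  = -x**2 + x*y         # partition value to be maximized over x
lb = sp.Integer(0)       # lower bound on x in this partition
ub = y                   # (symbolic) upper bound on x in this partition

candidates = [f.subs(x, lb), f.subs(x, ub)]
for root in sp.solve(sp.diff(f, x), x):        # Der0 critical points
    candidates.append(sp.expand(f.subs(x, root)))

# The full algorithm combines these candidates with casemax (introducing new
# decisions); here we simply list them.
print(candidates)        # [0, 0, y**2/4]
```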

28 Illustrative Value and Policy
[Figures over (x, y): the reward, the value functions V¹ and V², and the final value with its policy.]

29 Sequential Control Summary: continuous state, action, and observation (PO)MDPs
   Discrete action MDPs: UAI-11
   Continuous action MDPs (incl. exact policy): AAAI-12b
   Continuous observation POMDPs: NIPS-12
   Extensions to general continuous noise: in progress

30 Symbolic Constrained Optimization

31 max_x case(x) = Constrained Optimization!
   Conditional constraints, e.g., if (x > y) then (y < z); equivalent to 0-1 MILP / MIQP
   Factored / sparse constraints: constraints may be sparse, e.g., x₁ > x₂, x₂ > x₃, ..., x_{n-1} > xₙ; dynamic programming for continuous optimization!
   Parameterized optimization: f(y) = max_x f(x, y) yields the maximum value and the maximizing substitution as a function of y

32 Symbolic Machine Learning

33 Geometric Models: Piecewise Regression
How do we learn geometric models of objects? Piecewise convex!
   argmin_{x₁,y₁,...,xₙ,yₙ}  Σ_{(x,y)→z ∈ Images} ( z − { x ≥ x₁ ∧ y ≥ y₁ ∧ ... : 1 ; otherwise : 0 } )²
That is, fit the corner parameters (x₁, y₁), ..., (xₙ, yₙ) of a piecewise 0/1 indicator model to the observed pixel labels z, as in the brute-force sketch below.
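A brute-force sketch of this objective for a single axis-aligned rectangle; the data points and candidate grid are made up purely for illustration, and a real solver would exploit the piecewise-convex structure rather than enumerate.

```python
# Fit the corners (x1, y1, x2, y2) of a 0/1 rectangle indicator to labelled
# pixels by minimizing the squared error of slide 33's objective.
import itertools
import numpy as np

# (x, y, z) samples: z = 1 inside the true rectangle [2, 5] x [1, 3], else 0.
pts = np.array([[1, 1, 0], [3, 2, 1], [4, 1, 1], [4, 4, 0],
                [2, 3, 1], [6, 2, 0], [5, 3, 1], [0, 0, 0]], dtype=float)

def sq_error(x1, y1, x2, y2):
    inside = ((pts[:, 0] >= x1) & (pts[:, 0] <= x2) &
              (pts[:, 1] >= y1) & (pts[:, 1] <= y2)).astype(float)
    return np.sum((pts[:, 2] - inside) ** 2)

grid = range(0, 7)
best = min(((sq_error(x1, y1, x2, y2), (x1, y1, x2, y2))
            for x1, x2, y1, y2 in itertools.product(grid, repeat=4)
            if x1 < x2 and y1 < y2), key=lambda t: t[0])
print(best)   # zero-error corners include (2, 1, 5, 3)
```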

34 Optimal Clustering?
Use min to make each point snap to its nearest center: minimize the sum of min distances. The objective is piecewise convex, and there are no latent variables!
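A small numpy sketch of that objective; the data and centers are arbitrary illustrative values.

```python
# Sum-of-min-distances clustering objective: each point contributes the
# minimum of its squared distances to the K centers (no latent assignments).
import numpy as np

data = np.array([[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.8, 5.0]])

def clustering_objective(centers):
    # centers: (K, d) array; return the sum over points of min squared distance.
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

print(clustering_objective(np.array([[0.0, 0.0], [5.0, 5.0]])))  # near-optimal
print(clustering_objective(np.array([[2.5, 2.5], [9.0, 9.0]])))  # much worse
```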

35 What has everyone been missing? Symbolic algebra / calculus? No, they weren't Computer Scientists: the case representation blows up, so we need a data structure, the XADD.

36 BDD / ADDs Quick Introduction

37 Function Representation (Tables)
How do we represent functions B^n → R? How about a fully enumerated table over a, b, c with a column for F(a, b, c)? OK, but can we be more compact?

38 Function Representation (Trees)
How about a decision tree over a, b, c? Sure, and it can simplify by exploiting context-specific independence!

39 Function Representation (ADDs)
Why not a directed acyclic graph (DAG)? This is the Algebraic Decision Diagram (ADD); think of BDDs as the {0, 1} subset of the ADD range.

40 Binary Operations (ADDs)
Why do we order variable tests? Ordering enables efficient binary operations that recurse on matching sub-diagrams. Result: ADD operations can avoid full state enumeration.
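A compact sketch of the classic apply recursion on ordered decision diagrams, which is exactly what the ordering buys us; the Node class, ordering table, and helper are illustrative (real packages also hash-cons nodes so equal sub-diagrams are shared).

```python
# "Apply" on ordered algebraic decision diagrams: recurse on the earliest
# decision of the two operands and memoize on node pairs, giving roughly
# O(|f|*|g|) work instead of enumerating all variable assignments.
class Node:
    """Internal decision node; leaves are plain numbers."""
    def __init__(self, var, hi, lo):
        self.var, self.hi, self.lo = var, hi, lo

ORDER = {'a': 0, 'b': 1, 'c': 2}       # fixed variable test ordering

def apply_op(f, g, op):
    cache = {}                          # memoize on operand node pairs

    def rec(f, g):
        key = (id(f), id(g))
        if key in cache:
            return cache[key]
        if not isinstance(f, Node) and not isinstance(g, Node):
            res = op(f, g)              # both leaves: combine the values
        else:
            # Branch on whichever operand's test comes first in the ordering.
            fv = ORDER[f.var] if isinstance(f, Node) else float('inf')
            gv = ORDER[g.var] if isinstance(g, Node) else float('inf')
            var = f.var if fv <= gv else g.var
            fh, fl = (f.hi, f.lo) if isinstance(f, Node) and f.var == var else (f, f)
            gh, gl = (g.hi, g.lo) if isinstance(g, Node) and g.var == var else (g, g)
            res = Node(var, rec(fh, gh), rec(fl, gl))
        cache[key] = res
        return res

    return rec(f, g)

f = Node('a', 1.0, Node('b', 2.0, 0.0))
g = Node('b', 0.5, Node('c', 1.0, 0.0))
h = apply_op(f, g, lambda u, v: u + v)  # pointwise sum of the two ADDs
```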

41 Case → XADD
The XADD is the continuous-variable extension of the algebraic decision diagram: a compact, minimal case representation with efficient case operations.

42 Case → XADD
   V = { x₁ + k > 100 ∧ x₂ + k > 100 : 0
       { x₁ + k > 100 ∧ x₂ + k ≤ 100 : x₂
       { x₁ + k ≤ 100 ∧ x₂ + k > 100 : x₁
       { x₁ + x₂ + k > 100 ∧ x₁ + k ≤ 100 ∧ x₂ + k ≤ 100 ∧ x₂ > x₁ : x₂
       { x₁ + x₂ + k > 100 ∧ x₁ + k ≤ 100 ∧ x₂ + k ≤ 100 ∧ x₂ ≤ x₁ : x₁
       { x₁ + x₂ + k ≤ 100 : x₁ + x₂
       { ...
*With non-trivial extensions over the ADD, this can be reduced to a minimal canonical form!

43 Compactness of (X)ADDs: the diagram is linear in the number of decisions, while the case version has an exponential number of partitions!

44 XADD Maximization
Taking the max of two XADDs (e.g., one rooted at the decision y > 0, the other at x > 0, with leaves such as x and y) may introduce new decision tests like x > y at the leaves. Operations exploit structure: O(|f| · |g|).
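The one twist over the plain ADD apply sketched earlier is what happens at the leaves: comparing two symbolic leaf expressions produces a new decision node rather than a number. A minimal illustration, with an illustrative XADDNode class:

```python
# Leaf-level step of XADD maximization: max of two symbolic leaves becomes a
# new decision node testing which expression is larger (e.g. x > y).
import sympy as sp

x, y = sp.symbols('x y')

class XADDNode:
    """Decision node over a (possibly continuous) test expression."""
    def __init__(self, test, hi, lo):
        self.test, self.hi, self.lo = test, hi, lo

def max_leaves(f_leaf, g_leaf):
    diff = sp.simplify(f_leaf - g_leaf)
    if diff.is_number:                       # comparison resolvable outright
        return f_leaf if diff > 0 else g_leaf
    return XADDNode(f_leaf > g_leaf, f_leaf, g_leaf)   # new decision test

node = max_leaves(x, y)
print(node.test)                             # x > y: a newly introduced test
```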

45 Maintaining XADD Orderings
Max may get decisions out of order. With the decision ordering (root to leaf) x > y, y > 0, x > 0, the decision x > y newly introduced by max(...) appears below y > 0 and x > 0 in the result, so the new node is out of order!

46 Maintaining XADD Orderings
Substitution may also get decisions out of order. With the decision ordering (root to leaf) x > y, x > z, y > 0, applying the substitution {z/y} rewrites decisions (e.g., x > y becomes x > z), and the substituted nodes may now be out of order!

47 Correcting XADD Ordering
How do we obtain an ordered XADD from an unordered one? Key idea: binary operations maintain orderings. If the decision z is out of order at a node with true-branch ID₁ and false-branch ID₀, rebuild that node as (indicator of z ⊗ ID₁) ⊕ (indicator of ¬z ⊗ ID₀). Inductively assume ID₁ and ID₀ are ordered; then all operands are ordered, so applying ⊕ and ⊗ produces an ordered result!

48 Maintaining Minimality
Some nodes are unreachable: for example, under x > 0 and y > 0 the decision x + y < 0 is always false. If the decisions are linear, this can be detected with the feasibility checker of an LP solver and the node pruned; more subtle prunings are possible as well.
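A sketch of that feasibility check using scipy's LP solver: collect the linear constraints along a path together with the candidate decision and ask whether they are jointly satisfiable. Strict inequalities are relaxed by a small epsilon, which is an approximation; the helper below is illustrative, not the authors' pruning routine.

```python
# Prune-check: is the region {v : A_ub @ v <= b_ub} nonempty? If not, the
# XADD branch guarded by these constraints is unreachable and can be pruned.
import numpy as np
from scipy.optimize import linprog

def feasible(A_ub, b_ub):
    n = A_ub.shape[1]
    res = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * n)   # zero objective, free variables
    return res.status == 0                     # 0 = solved, so region nonempty

eps = 1e-6
# Path constraints x > 0, y > 0 plus candidate decision x + y < 0, written as
# A @ [x, y] <= b:  -x <= -eps, -y <= -eps, x + y <= -eps.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([-eps, -eps, -eps])
print(feasible(A, b))   # False: the x + y < 0 branch is unreachable here
```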

49 XADD Makes All of the Previous Inference Possible: we could not even do a single integral or maximization without it!

50 Open Problems

51 Continuous Actions, Nonlinear Robotics
   Continuous position and joint angles: represent exactly with polynomials (e.g., radius constraints)
   Obstacle navigation in 2D, 3D, or 4D (with time): don't discretize into grid worlds! But the constraints are nonlinear.
   Multilinear and quadratic extensions; in general, algebraic geometry. Solve in 2 steps!

52 Open Problems Bounded (interval) approximation This XADD has > 1000 nodes!

53 Recap
Defined a calculus for piecewise functions:
   f₁ ⊕ f₂,  f₁ ⊗ f₂
   max(f₁, f₂),  min(f₁, f₂)
   ∫ₓ f(x) dx
   maxₓ f(x),  minₓ f(x)
Defined the XADD to compute efficiently with cases. This makes possible:
   Exact inference in continuous graphical models
   New paradigms for optimization and sequential control
   New formalizations of machine learning problems

54 Symbolic Piecewise Calculus + XADD = Expressive Continuous Inference, Optimization, & Control Thank you! Questions?
