Closed-form Gibbs Sampling for Graphical Models with Algebraic constraints. Hadi Mohasel Afshar Scott Sanner Christfried Webers



1 Closed-form Gibbs Sampling for Graphical Models with Algebraic constraints Hadi Mohasel Afshar Scott Sanner Christfried Webers


2 Inference in Hybrid Graphical Models / Probabilistic Programs
Limitations of BUGS, PyMC, Anglican, and STAN:
- They don't handle piecewise functions well, i.e., convergence is slow with conditionals such as if (t > 0)
- They don't handle simple algebraic constraints, i.e., you cannot assign x = y + 1 (you have to add noise)
Our solution efficiently handles piecewise functions and algebraic constraints

3 Sneak Preview: Sampling in Piecewise Models with Algebraic Constraints


4 Contribution 1: We present an effective sampler for GMs with piecewise factors
Polynomial-piecewise fractional functions (PPFs): the case values are fractional, the case conditions are polynomial
Example:
    x y          if x + y > 0
    x 2 +y y 2   if x + y < 0, y > 0


6 PPFs can be used for:
- Representing truncated/finite support models
- Approximating arbitrary models
- Probabilistic programming

    % Draw from uniform (0, 1)
    x = rand;
    if (x < 0.5)
        % Draw from standard Normal
        y = randn;
    else
        % Draw from Gamma(1, 1)
        y = randg + 2.0;
    end
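The program on this slide can also be run as a forward sampler. The following is a minimal Python sketch of it, not the authors' code: the function name `sample_program` and the seed are illustrative assumptions, and Python's standard `random` module stands in for `rand`, `randn`, and `randg`.

```python
import random

def sample_program(rng):
    """One forward draw (x, y) from the slide's probabilistic program."""
    x = rng.random()                            # x ~ Uniform(0, 1)
    if x < 0.5:
        y = rng.gauss(0.0, 1.0)                 # y ~ standard Normal
    else:
        y = rng.gammavariate(1.0, 1.0) + 2.0    # y ~ Gamma(1, 1), shifted by 2.0
    return x, y

rng = random.Random(0)                          # fixed seed, chosen arbitrarily
draws = [sample_program(rng) for _ in range(10_000)]

# Draws from the else-branch all satisfy y >= 2, so the density of y is
# piecewise: Normal mass everywhere plus shifted-Gamma mass on [2, inf).
gamma_branch = [y for x, y in draws if x >= 0.5]
```

The conditional on `x` is what makes the joint density of (x, y) piecewise, which is the kind of model the PPF representation is meant to cover.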

7 PPFs can be used for:
- Representing truncated/finite support models
- Approximating arbitrary models
- Probabilistic programming
- Bayesian inference: piecewise priors and likelihoods
- Algebraic constraints!

8 Contribution 2: Algebraic Constraints
An example:
$pr(M_1) = U(M_1; 0.1, 2.1)$
$pr(M_2) = U(M_2; 0.1, 2.1)$
$pr(V_1) = U(V_1; -2, 2)$
$pr(V_2 \mid V_1) = U(V_2; -2, V_1)$
Observation: $P_{tot} = 3$, that is, $M_1 V_1 + M_2 V_2 = 3$
Query: $pr(V_1, M_2, V_2 \mid P_{tot} = 3)$?
$$pr(V_1, M_2, V_2 \mid P_{tot} = 3) \propto \int pr(M_1, V_1, M_2, V_2)\, pr(P_{tot} = 3 \mid M_1, V_1, M_2, V_2)\, dM_1$$
Dirac delta of a function:
$$\delta(f(x)) = \sum_{r \in \mathrm{roots}(f(x))} \frac{\delta(x - r)}{\left| \partial f(x) / \partial x \right|}$$
$P_{tot} = M_1 V_1 + M_2 V_2$, so:
$$pr(P_{tot} = 3 \mid M_1, V_1, M_2, V_2) = \delta(M_1 V_1 + M_2 V_2 - 3) = \frac{1}{|V_1|}\, \delta\!\left(M_1 - \frac{3 - M_2 V_2}{V_1}\right)$$

9 Contribution 2: Algebraic Constraints
Same example and query as above. The collapsed delta introduces divisions and absolute values! PPFs are closed under them.
Next: PPFs are closed under fractional substitutions!
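The delta-function identity can be applied to the momentum constraint one step at a time. The following is a sketch of that calculation; the name $f(M_1)$ for the constraint viewed as a function of $M_1$ is introduced here for illustration, and $V_1 \neq 0$ is assumed.

```latex
\begin{aligned}
f(M_1) &= M_1 V_1 + M_2 V_2 - 3 \\
f(r) = 0 \;&\Rightarrow\; r = \frac{3 - M_2 V_2}{V_1} \\
\frac{\partial f(M_1)}{\partial M_1} &= V_1 \\
\delta\bigl(f(M_1)\bigr) &= \frac{1}{|V_1|}\,\delta\!\left(M_1 - \frac{3 - M_2 V_2}{V_1}\right)
\end{aligned}
```

The constraint is linear in $M_1$, so it has a single root and the sum over roots reduces to one term.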

10 Collapse Algebraic Constraints
$pr(M_1) = U(M_1; 0.1, 2.1)$
$pr(M_2) = U(M_2; 0.1, 2.1)$
$pr(V_1) = U(V_1; -2, 2)$
$pr(V_2 \mid V_1) = U(V_2; -2, V_1)$
Observation: $P_{tot} = 3$, that is, $M_1 V_1 + M_2 V_2 = 3$
Query: $pr(V_1, M_2, V_2 \mid P_{tot} = 3)$?
$P_{tot} = M_1 V_1 + M_2 V_2$, so:
$$pr(P_{tot} = 3 \mid M_1, V_1, M_2, V_2) = \delta(M_1 V_1 + M_2 V_2 - 3) = \frac{1}{|V_1|}\, \delta\!\left(M_1 - \frac{3 - M_2 V_2}{V_1}\right)$$
Collapse out $M_1$:
$$pr(V_1, M_2, V_2 \mid P_{tot} = 3) \propto \int pr(M_1, V_1, M_2, V_2)\, pr(P_{tot} = 3 \mid M_1, V_1, M_2, V_2)\, dM_1$$
$$\propto \begin{cases}
\dfrac{1}{V_1 (V_1 + 2)} & \text{if } 0 < V_1,\ 0.1 < \dfrac{3 - M_2 V_2}{V_1} < 2.1,\ 1 < M_2 < 3,\ -2 < V_1 < 2,\ -2 < V_2 < V_1 \\[2ex]
\dfrac{-1}{V_1 (V_1 + 2)} & \text{if } 0 > V_1,\ 0.1 < \dfrac{3 - M_2 V_2}{V_1} < 2.1,\ 1 < M_2 < 3,\ -2 < V_1 < 2,\ -2 < V_2 < V_1 \\[2ex]
0 & \text{otherwise}
\end{cases}$$
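The collapse step can be checked numerically by substituting the root for M_1 into the joint prior and dividing by |V_1|. The following is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `uniform_pdf` and `collapsed_density` are hypothetical, the supports are taken directly from the four uniform priors listed on the slide, and the constant prior factors are kept rather than dropped.

```python
def uniform_pdf(x, lo, hi):
    """Density of U(lo, hi) at x; zero outside the open interval (lo, hi)."""
    return 1.0 / (hi - lo) if lo < x < hi else 0.0

def collapsed_density(v1, m2, v2, p_tot=3.0):
    """Unnormalised pr(V1, M2, V2 | P_tot = p_tot) after integrating out M1."""
    if v1 == 0.0:
        return 0.0                        # the root for M1 is undefined at V1 = 0
    m1 = (p_tot - m2 * v2) / v1           # value of M1 forced by the constraint
    # Joint prior at the forced M1 (supports taken from the slide's priors).
    prior = (uniform_pdf(m1, 0.1, 2.1) * uniform_pdf(m2, 0.1, 2.1)
             * uniform_pdf(v1, -2.0, 2.0) * uniform_pdf(v2, -2.0, v1))
    return prior / abs(v1)                # 1/|V1| comes from the delta identity

# Inside the support the value is 1 / (16 |V1| (V1 + 2)): the constant 1/16
# is the product of the three constant uniform densities (1/2 * 1/2 * 1/4).
value = collapsed_density(1.5, 1.0, 0.5)
```

Keeping or dropping the constant 1/16 does not affect Gibbs sampling, since only the shape of each conditional matters.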

11 Sneak Preview: Inference Results
Even in this simple example, posteriors are multimodal and piecewise!
[Figure panels: $pr(M_1, V_1 \mid P_{tot} = 3, V_2 = 0.2)$; $pr(V_1, V_2 \mid P_{tot} = 3, M_1 = 2)$; $pr(M_1, V_1 \mid P_{tot} = 3)$; $pr(V_1, V_2 \mid P_{tot} = 3)$]

12 Where are we?
- We've written down expressive PPF models
- We've collapsed out determinism (constraints)
- We still need to do inference in the collapsed PPF model

13 Inference in Piecewise Algebraic Models
- Closed-form solution? Generally, impossible!
- Metropolis-Hastings? Low acceptance rate! Due to high KL-divergence between the proposal and target densities


14 Inference in Piecewise Algebraic Models
- Closed-form solution? Generally, impossible!
- Metropolis-Hastings? Low acceptance rate!
- Hamiltonian Monte Carlo? Low acceptance rate! Also no discrete variables! The HMC leapfrog mechanism relies on the assumption of smoothness.

15 Inference in Piecewise Algebraic Models
- Closed-form solution? Generally, impossible!
- Metropolis-Hastings? Low acceptance rate!
- Hamiltonian Monte Carlo? Low acceptance rate!
- Slice sampling? Poor performance on multimodal densities!
- Gibbs sampling? Slow, due to per-sample (multiple) CDF computation (integration)!
We are going to make it fast!

16 Gibbs sampling
Remember that in Gibbs, sampling from an n-dimensional function is done in n steps.
Current sample: $(X_1 = a, X_2 = b)$
$X_1 \sim pr(X_1 \mid X_2 = b)$ gives the intermediate sample $(X_1 = a', X_2 = b)$
$X_2 \sim pr(X_2 \mid X_1 = a')$ gives the next sample $(X_1 = a', X_2 = b')$
n univariate CDFs = n integrations per sample
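The per-sample cost described on this slide can be seen in a small baseline sampler. The following is a sketch under stated assumptions, not the authors' implementation: the toy target p(x, y) = x + y on the unit square, the trapezoid-rule CDF, the grid size, and the function names are all illustrative choices.

```python
import bisect
import random

def density(x, y):
    """Toy target on the unit square, p(x, y) = x + y (an assumption)."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def sample_conditional(cond, rng, grid=100):
    """Draw from the 1-D density `cond` on [0, 1] by numeric CDF inversion."""
    xs = [i / grid for i in range(grid + 1)]
    cdf = [0.0]
    for a, b in zip(xs, xs[1:]):             # trapezoid rule: one integration
        cdf.append(cdf[-1] + 0.5 * (cond(a) + cond(b)) * (b - a))
    u = rng.random() * cdf[-1]               # uniform draw scaled to total mass
    i = min(max(bisect.bisect_left(cdf, u), 1), grid)
    # Linear interpolation inside the grid cell that contains u.
    frac = (u - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
    return xs[i - 1] + frac * (xs[i] - xs[i - 1])

def baseline_gibbs(n_samples, seed=0):
    """Gibbs sampler that re-integrates a conditional CDF at every step."""
    rng = random.Random(seed)
    x, y = 0.5, 0.5                          # arbitrary starting point
    samples = []
    for _ in range(n_samples):
        x = sample_conditional(lambda t: density(t, y), rng)  # X1 ~ pr(X1 | X2 = b)
        y = sample_conditional(lambda t: density(x, t), rng)  # X2 ~ pr(X2 | X1 = a')
        samples.append((x, y))
    return samples
```

Each call to `sample_conditional` performs a fresh integration, so drawing S samples in n dimensions costs S times n integrations.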


17 Gibbs sampling
Remember that in Gibbs, sampling from an n-dimensional function is done in n steps.
Current sample: $(X_1 = a, X_2 = b)$
$X_1 \sim pr(X_1 \mid X_2 = b)$ gives the intermediate sample $(X_1 = a', X_2 = b)$
$X_2 \sim pr(X_2 \mid X_1 = a')$ gives the next sample $(X_1 = a', X_2 = b')$
n univariate CDFs = n integrations per sample
IDEA! What if we can compute CDFs symbolically and prior to sampling?


18 Gibbs sampling
Remember that in Gibbs, sampling from an n-dimensional function is done in n steps.
Only need to do one integral per variable, which is possible for a large class of PPFs
Mapping variables to symbolic CDFs: $X_1$, $X_2$
n integrals rather than n × #samples
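The idea of integrating once and only evaluating during sampling can be illustrated on a toy model. The following is a sketch, not the authors' implementation: the target p(x, y) = x + y on the unit square is an assumption chosen because its conditional CDF, F(x | y) = (x^2/2 + x y) / (1/2 + y), can be derived by hand and even inverted in closed form, which is a convenience of this toy and not something the slides claim in general.

```python
import math
import random

def inverse_cdf(u, other):
    """Closed-form inverse of F(x | other) = (x**2/2 + x*other) / (1/2 + other).

    Solving F(x | other) = u for x in [0, 1] gives the positive root of
    x**2 + 2*other*x - u*(1 + 2*other) = 0.
    """
    return -other + math.sqrt(other * other + u * (1.0 + 2.0 * other))

def symbolic_gibbs(n_samples, seed=0):
    """Gibbs sampler that only evaluates a precomputed analytic CDF inverse."""
    rng = random.Random(seed)
    x, y = 0.5, 0.5                          # arbitrary starting point
    samples = []
    for _ in range(n_samples):
        x = inverse_cdf(rng.random(), y)     # X1 ~ pr(X1 | X2 = y), no integration
        y = inverse_cdf(rng.random(), x)     # X2 ~ pr(X2 | X1 = x), by symmetry
        samples.append((x, y))
    return samples
```

The integral behind `inverse_cdf` was done once, on paper, before sampling; inside the loop there are only function evaluations.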

19 Returning to our example:
$$pr(V_1, M_2, V_2 \mid P_{tot} = 3) \propto \begin{cases}
\dfrac{1}{V_1 (V_1 + 2)} & \text{if } 0 < V_1,\ V_1 < 30 - 10 M_2 V_2,\ \dfrac{3 - M_2 V_2}{2.1} < V_1,\ 1 < M_2 < 3,\ -2 < V_1,\ V_1 < 2,\ -2 < V_2,\ V_2 < V_1 \\[2ex]
\dfrac{-1}{V_1 (V_1 + 2)} & \text{if } 0 > V_1,\ V_1 > 30 - 10 M_2 V_2,\ \dfrac{3 - M_2 V_2}{2.1} > V_1,\ 1 < M_2 < 3,\ -2 < V_1 < 2,\ -2 < V_2 < V_1 \\[2ex]
0 & \text{otherwise}
\end{cases}$$

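20 Returning to our example:
$V_1$ maps to its symbolic CDF:
$$F_{V_1}(v_1) = \int_{V_1 = -\infty}^{v_1} pr(V_1, M_2, V_2 \mid P_{tot} = 3)\, dV_1$$
$$\propto \int_{V_1 = -\infty}^{v_1} \begin{cases}
\dfrac{1}{V_1 (V_1 + 2)} & \text{if } 0 < V_1,\ V_1 < 30 - 10 M_2 V_2,\ \dfrac{3 - M_2 V_2}{2.1} < V_1,\ 1 < M_2 < 3,\ -2 < V_1,\ V_1 < 2,\ -2 < V_2,\ V_2 < V_1 \\[2ex]
0 & \text{otherwise}
\end{cases} dV_1$$
$$+ \int_{V_1 = -\infty}^{v_1} \begin{cases}
\dfrac{-1}{V_1 (V_1 + 2)} & \text{if } 0 > V_1,\ V_1 > 30 - 10 M_2 V_2,\ \dfrac{3 - M_2 V_2}{2.1} > V_1,\ 1 < M_2 < 3,\ -2 < V_1 < 2,\ -2 < V_2 < V_1 \\[2ex]
0 & \text{otherwise}
\end{cases} dV_1$$
Let's just consider one statement.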

21 Returning to our example:
Let's just consider one statement:
$$\int_{V_1 = -\infty}^{v_1} \begin{cases}
\dfrac{1}{V_1 (V_1 + 2)} & \text{if } 0 < V_1,\ V_1 < 30 - 10 M_2 V_2,\ \dfrac{3 - M_2 V_2}{2.1} < V_1,\ 1 < M_2 < 3,\ -2 < V_1,\ V_1 < 2,\ -2 < V_2,\ V_2 < V_1 \\[2ex]
0 & \text{otherwise}
\end{cases} dV_1$$

22 Returning to our example:
The conditions on $V_1$ become integration bounds, so the integral above equals
$$\begin{cases}
\displaystyle\int_{V_1 = \max\{0,\, V_2,\, \frac{3 - M_2 V_2}{2.1}\}}^{\min\{v_1,\, 30 - 10 M_2 V_2,\, 2\}} \frac{dV_1}{V_1 (V_1 + 2)} & \text{if } \min\{v_1,\, 30 - 10 M_2 V_2,\, 2\} > \max\left\{0,\, V_2,\, \dfrac{3 - M_2 V_2}{2.1}\right\},\ 1 < M_2 < 3,\ -2 < V_2 \\[2ex]
0 & \text{otherwise}
\end{cases}$$
A large set of algebraic functions have closed-form indefinite integrals, e.g., here:
$$\int \frac{dV_1}{V_1 (V_1 + 2)} = \frac{\log V_1 - \log(V_1 + 2)}{2}$$
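The closed-form integral for this one statement can be checked against numerical quadrature. The following is a minimal sketch, not the authors' code: the function names are hypothetical, M_2 > 0 is assumed (which keeps the lower bound strictly positive), and the range conditions on M_2 and V_2 are assumed to hold and are left to the caller.

```python
import math

def antiderivative(v):
    """Indefinite integral of 1 / (v (v + 2)) for v > 0."""
    return 0.5 * (math.log(v) - math.log(v + 2.0))

def case1_cdf_term(v1, m2, v2):
    """Closed-form integral of the first case up to v1.

    Assumes m2 > 0 and that the M2 and V2 range conditions already hold.
    """
    lower = max(0.0, v2, (3.0 - m2 * v2) / 2.1)   # lower bounds on V1
    upper = min(v1, 30.0 - 10.0 * m2 * v2, 2.0)   # upper bounds on V1
    if upper <= lower:
        return 0.0                                # empty interval: the "otherwise" case
    return antiderivative(upper) - antiderivative(lower)

def midpoint_integral(lower, upper, steps=100_000):
    """Numerical reference for the same integrand (midpoint rule)."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        v = lower + (k + 0.5) * h
        total += h / (v * (v + 2.0))
    return total

closed = case1_cdf_term(1.8, 1.0, 0.5)
numeric = midpoint_integral(2.5 / 2.1, 1.8)       # same bounds, computed by hand
```

With M_2 = 1 and V_2 = 0.5 the lower bound is (3 - 0.5) / 2.1 and the upper bound is min(v_1, 25, 2), so `closed` and `numeric` integrate over the same interval.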

23 Inference in Piecewise Algebraic GMs
1. Collapse determinism: collapse one variable in each algebraic constraint
2. To take S samples from an N-dimensional model:
   - In baseline Gibbs, S × N (univariate) CDFs are computed.
   - In Symbolic Gibbs, N (analytical) CDFs and S × N function evaluations are required. Much faster!

24 Results

25 Results

26 Conclusions
Expressive Graphical Models / Probabilistic Programs
- Allow algebraic constraints
- Represent factors as polynomial-piecewise fractions (PPFs)
- Sufficient for a rich class of probabilistic programs, among many other applications
Result 1: Collapse all algebraic constraints (determinism)
- Yields symbolic substitutions into PPF form
Result 2: PPFs are one-time integrable!
- Symbolically pre-compute all conditionals for Gibbs sampling
- Leads to a very fast Gibbs sampler!
Expressive Exact GM / PP MCMC Inference!
