Approximating Probabilistic Inference


Approximating Probabilistic Inference. Kuldeep S. Meel, PhD Student, CAVR Group. Joint work with Supratik Chakraborty (IITB), Daniel J. Fremont (UCB), Sanjit A. Seshia (UCB), Moshe Y. Vardi (Rice).

IoT: Internet of Things

The Era of Data: how to make inferences from data.

Probabilistic Inference: Given that Mary (aged 65) called 911, what is the probability of a burglary in the house? Pr[event | evidence]

Graphical Models: Bayesian Networks

[Bayesian network: Burglary (B) and Earthquake (E) point to Alarm (A); Alarm points to MaryWakes (M); MaryWakes and PhoneWorking (P) point to Call (C).]

The network's CPTs (condensed; complementary rows omitted):

Pr[B=T] = 0.05    Pr[E=T] = 0.01    Pr[P=T] = 0.8

B E | Pr[A=T|B,E]      A | Pr[M=T|A]      M P | Pr[C=T|M,P]
T T | 0.88             T | 0.7            T T | 0.99
T F | 0.91             F | 0.1            T F | 0
F T | 0.97                                F T | 0
F F | 0.02                                F F | 0.1

What is Pr[Burglary | Call]?

Bayes Rule to the Rescue: Pr[Burglary | Call] = Pr[Burglary ∧ Call] / Pr[Call], and Pr[Burglary ∧ Call] = Pr[B,E,A,M,P,C] + Pr[B,¬E,A,M,P,C] + ...

Each term is computed by the chain rule over the network, reading every factor off the CPTs above: Pr[B,E,A,M,P,C] = Pr[B] · Pr[E] · Pr[P] · Pr[A|B,E] · Pr[M|A] · Pr[C|M,P].

Pr[B,¬E,A,M,P,C] is computed the same way, with Pr[¬E] = 0.99 in place of Pr[E] = 0.01.

So are we done? We can enumerate all paths, compute the probability of every path, and sum up all the probabilities.
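This brute-force strategy fits in a few lines of code. Below is a minimal Python sketch (illustrative only, not the speaker's implementation) that enumerates every complete assignment consistent with the evidence Call = T and sums the chain-rule probabilities, using the CPTs above:

```python
from itertools import product

# CPTs from the slides.
pr_B = {True: 0.05, False: 0.95}
pr_E = {True: 0.01, False: 0.99}
pr_P = {True: 0.8,  False: 0.2}
pr_A = {(True, True): 0.88, (True, False): 0.91,   # Pr[A=T | B, E]
        (False, True): 0.97, (False, False): 0.02}
pr_M = {True: 0.7, False: 0.1}                     # Pr[M=T | A]
pr_C = {(True, True): 0.99, (True, False): 0.0,    # Pr[C=T | M, P]
        (False, True): 0.0, (False, False): 0.1}

def joint(b, e, a, m, p, c):
    """Pr[B,E,A,M,P,C] = Pr[B] Pr[E] Pr[P] Pr[A|B,E] Pr[M|A] Pr[C|M,P]."""
    pa = pr_A[(b, e)] if a else 1 - pr_A[(b, e)]
    pm = pr_M[a] if m else 1 - pr_M[a]
    pc = pr_C[(m, p)] if c else 1 - pr_C[(m, p)]
    return pr_B[b] * pr_E[e] * pr_P[p] * pa * pm * pc

# Sum over all "paths" (complete assignments) consistent with evidence C=T.
num = sum(joint(True, e, a, m, p, True)
          for e, a, m, p in product([True, False], repeat=4))
den = sum(joint(b, e, a, m, p, True)
          for b, e, a, m, p in product([True, False], repeat=5))
print(num / den)  # Pr[Burglary | Call]
```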

Where is the catch? There are exponentially many paths.

Prior Work: [plot of scalability (y-axis) vs. quality/guarantees (x-axis). BP and MCMC scale well but give no guarantee that the computed count C equals the true value f; exact methods guarantee C = f but do not scale.]

Our Contribution: [same plot with WeightMC added between the two extremes, combining scalability with approximation guarantees.]

Approximation Guarantees. Input: (ε, δ); Output: C such that Pr[ f/(1+ε) ≤ C ≤ f·(1+ε) ] ≥ 1 − δ.

An idea for a new paradigm? Partition the space of paths into "small" and "equal weighted" cells. Small: the # of paths in a cell is not large (bounded by a constant). Equal weighted: all the cells have equal weight.

Outline: Reduction to SAT; Weighted Model Counting; Looking forward.

Boolean Satisfiability. SAT: Given a Boolean formula F over variables V, determine if F is true for some assignment to V. Example: F = (a ∨ b); R_F = {(0,1), (1,0), (1,1)}. SAT is NP-complete (Cook 1971).

Model Counting. Given: CNF formula F, solution space R_F. Problem (MC): What is the total number of satisfying assignments (models), i.e., |R_F|? Example: F = (a ∨ b); R_F = {(0,1), (1,0), (1,1)}; |R_F| = 3.

Weighted Model Counting. Given: CNF formula F, solution space R_F, weight function W(·) over assignments. Problem (WMC): What is the sum of the weights of the satisfying assignments, i.e., W(R_F)? Example: F = (a ∨ b); R_F = {(0,1), (1,0), (1,1)}; W((0,1)) = W((1,0)) = 1/3; W((1,1)) = W((0,0)) = 1/6; W(R_F) = 1/3 + 1/3 + 1/6 = 5/6.

Weighted SAT. Boolean formula F; weight function over the literals; weight of an assignment = product of the weights of its literals. Example: F = (a ∨ b); W(a=0) = 0.4, W(a=1) = 1 − 0.4 = 0.6; W(b=0) = 0.3, W(b=1) = 0.7; W((0,1)) = W(a=0) · W(b=1) = 0.4 × 0.7 = 0.28.
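The definitions above fit in a few lines of brute-force Python. The sketch below is exponential by design, purely to make the semantics concrete; `formula` is assumed to be any predicate over an assignment dictionary:

```python
from itertools import product

def weighted_count(formula, variables, w):
    """Sum of weights of satisfying assignments, where the weight of an
    assignment is the product of its literal weights: w[v] for v=1 and
    1 - w[v] for v=0. (With w[v] = 1/2 for all v this is plain model
    counting, up to a factor of 2^n.)"""
    total = 0.0
    for values in product([0, 1], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if formula(assign):
            prod = 1.0
            for v in variables:
                prod *= w[v] if assign[v] else 1 - w[v]
            total += prod
    return total

# Running example: F = (a or b) with W(a=1) = 0.6, W(b=1) = 0.7.
F = lambda s: s["a"] or s["b"]
print(weighted_count(F, ["a", "b"], {"a": 0.6, "b": 0.7}))
# 0.28 + 0.18 + 0.42 = 0.88
```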

Reduction to W-SAT:

Bayesian Network        →  SAT Formula
Nodes                   →  Variables
Rows of the CPTs        →  Variables
Probabilities in CPTs   →  Weights
Event and evidence      →  Constraints

Reduction to W-SAT. Every satisfying assignment = a valid path in the network that satisfies the constraint (evidence). Probability of a path = weight of the satisfying assignment = product of the weights of its literals = product of the conditional probabilities. Sum of path probabilities = weighted sum over satisfying assignments.
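To make the rows-to-variables mapping concrete, here is one way such an encoding can look, in the spirit of standard parameter-variable encodings of Bayesian networks into weighted CNF; the clause set and weight scheme below are an illustrative sketch, not necessarily the exact encoding used in this work:

```python
# Encode Pr[M | A] from the CPT above as weighted clauses.
# Fresh parameter variables: tA for the row A=T, tF for the row A=F.
# Literal weights: W(tA)=0.7, W(~tA)=0.3, W(tF)=0.1, W(~tF)=0.9;
# A and M get weight 1 on both literals (their probability lives in the
# parameter variables). Clauses assert A -> (M <-> tA) and ~A -> (M <-> tF):
clauses = [
    ["~A", "~M", "tA"],   # A & M   ->  tA
    ["~A", "M", "~tA"],   # A & ~M  -> ~tA
    ["A", "~M", "tF"],    # ~A & M  ->  tF
    ["A", "M", "~tF"],    # ~A & ~M -> ~tF
]
weights = {"tA": 0.7, "tF": 0.1}  # weight of a negative literal is 1 - w
```

With this encoding, the literal-weight product of any satisfying assignment contributes exactly the Pr[M | A] factor of the corresponding path.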

Why SAT? SAT stopped being NP-complete in practice! zChaff (Malik, 2001) started the SAT revolution, and SAT solvers have been following Moore's law.

[Plot: speed-up of the 2012 solver over earlier solvers, on a log scale from 1x to 1,000x. Solvers: Grasp (2000), zChaff (2001), BerkMin (2002-03), zChaff (2003-04), Siege (2004), MiniSat + SatELite (2005), MiniSat2 (2006), RSat + SatELite (2007), Precosat (2009), CryptoMiniSat (2010), Glucose 2.0 (2011), Glucose 2.1 (2012).]

Why SAT? SAT stopped being NP-complete in practice! zChaff (Malik, 2001) started the SAT revolution, and SAT solvers follow Moore's law. "Symbolic Model Checking without BDDs" was named the most influential paper in the first 20 years of TACAS. And SAT offers a simple input/output interface.

Riding the SAT revolution: Probabilistic Inference → SAT.

Where is the catch? Model counting is very hard (#P-hard), and #P is harder than the whole polynomial hierarchy. Exact algorithms do not scale to large formulas, and approximate counting algorithms do not provide theoretical guarantees.

Outline: Reduction to SAT; Approximate Weighted Model Counting; Looking forward.

Counting through Partitioning: [figure: the solution space is partitioned into roughly equal cells.] Pick a random cell. Estimated weight of solutions = weight of the cell × total # of cells.

Approximate counting via partitioning: [figure: t independent runs produce the estimates 690, 710, 730, 730, 731, 831, ..., 834.]

Scaling the confidence: run the algorithm t times (estimates 690, 710, 730, 730, 731, 831, ..., 834) and take the median.
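The median trick is standard: if each run lands within the tolerance with probability strictly above 1/2, the median of O(log(1/δ)) independent runs is within tolerance with probability at least 1 − δ, by a Chernoff bound. A sketch (the constant 35 is illustrative, not the paper's exact constant):

```python
import math
from statistics import median

def amplified_estimate(one_run, delta):
    """Repeat a constant-confidence estimator and return the median of the
    estimates; O(log(1/delta)) repetitions drive the failure probability
    below delta."""
    t = max(1, math.ceil(35 * math.log2(3.0 / delta)))
    return median(one_run() for _ in range(t))
```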

How to Partition? How do we partition the solution space into roughly equal, small cells without knowing the distribution of solutions? 3-universal hashing [Carter-Wegman 1979, Sipser 1983].

XOR-Based Hashing. 3-universal hashing: partition the 2^n space into 2^m cells. Variables: X_1, X_2, X_3, ..., X_n. Pick every variable with probability ½, XOR them, and equate the result to 0/1 with probability ½: X_1 + X_3 + X_6 + ... + X_{n-1} = 0. m XOR equations → 2^m cells.
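A sketch of this random-XOR construction in Python, identifying the n variables with the integers 1..n and representing each constraint as a (subset, parity) pair:

```python
import random

def random_xor_constraints(n, m, rng=random):
    """Draw m random XOR constraints over x1..xn: each variable joins a
    constraint with probability 1/2, and the right-hand side is 0 or 1 with
    probability 1/2. Conjoined with F, they select one of 2^m cells."""
    return [([x for x in range(1, n + 1) if rng.random() < 0.5],
             rng.randrange(2)) for _ in range(m)]

def in_cell(assignment, constraints):
    """assignment: dict variable -> 0/1."""
    return all(sum(assignment[x] for x in xs) % 2 == b
               for xs, b in constraints)
```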

Counting through Partitioning

Partitioning: how large should the cells be?

Size of a cell: too large ⇒ hard to enumerate; too small ⇒ the variance can be very high. Tighter bounds require larger cells: pivot = 5(1 + 1/ε)².
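Combining the XOR hashing with the pivot gives the core of one counting run, in the style of hashing-based counters like ApproxMC/WeightMC. The sketch below brute-forces the enumeration for clarity; a real implementation replaces it with repeated SAT-solver calls on F conjoined with the XORs:

```python
import math
import random
from itertools import product

def pivot(eps):
    # Cell-size threshold from the slide: pivot = 5 * (1 + 1/eps)^2.
    return math.ceil(5 * (1 + 1 / eps) ** 2)

def one_counting_run(formula, variables, eps, rng=random):
    """Add random XORs one at a time (halving the space each time) until the
    surviving cell has at most `pivot` solutions, then scale up by the
    number of cells."""
    n, p = len(variables), pivot(eps)
    xors = []  # each entry: (subset of variables, parity bit)
    for m in range(n + 1):
        cell = []
        for values in product([0, 1], repeat=n):
            a = dict(zip(variables, values))
            if formula(a) and all(sum(a[x] for x in xs) % 2 == b
                                  for xs, b in xors):
                cell.append(values)
                if len(cell) > p:
                    break  # cell still too big; stop enumerating early
        if len(cell) <= p:
            return len(cell) * 2 ** m  # |cell| * (number of cells)
        xors.append(([x for x in variables if rng.random() < 0.5],
                     rng.randrange(2)))
```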

Dependence on the distribution. Normalized weight of a solution y: w(y) = W(y)/W_max. Maximum weight of a cell = pivot. Maximum # of solutions in a cell = pivot · W_max/W_min. Tilt = W_max/W_min.

Strong Theoretical Guarantees. Approximation: WeightMC(B, ε, δ) returns C such that Pr[ f/(1+ε) ≤ C ≤ f·(1+ε) ] ≥ 1 − δ. Complexity: the # of calls to the SAT solver is linear in the tilt ρ and polynomial in log(1/δ), |F|, and 1/ε.

Handling Large Tilt: solution weights .001, .001, .001, .001, .002, .001, .002, .002, .992. Tilt: 992.

Handling Large Tilt: partition the solutions by weight range, e.g. one region with .001 ≤ wt < .002 and another with .50 ≤ wt < 1; within each region the tilt is at most 2. Restricting to a region requires a pseudo-Boolean solver, but each query is still a SAT problem, not optimization. Overall tilt: 992; tilt for each region: 2.
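A sketch of the region bookkeeping behind this divide-and-conquer step (illustrative; the actual algorithm restricts counting to each region with a pseudo-Boolean constraint on the assignment weight and sums the per-region estimates):

```python
import math

def weight_regions(w_max, w_min):
    """Split [w_min, w_max] into geometrically shrinking regions
    [w_max/2^(i+1), w_max/2^i), each with tilt at most 2, so the number of
    regions is only log2(tilt)."""
    k = math.ceil(math.log2(w_max / w_min))
    return [(w_max / 2 ** (i + 1), w_max / 2 ** i) for i in range(k)]

print(weight_regions(0.992, 0.001))  # 10 regions for the example above
```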

Main Contributions. A novel parameter, tilt (ρ), to characterize complexity: ρ = W_max/W_min over satisfying assignments. Small tilt (ρ): an efficient hashing-based technique that requires only a SAT solver. Large tilt (ρ): divide-and-conquer using a pseudo-Boolean solver.

Strong Theoretical Guarantees. Approximation: WeightMC(B, ε, δ) returns C such that Pr[ f/(1+ε) ≤ C ≤ f·(1+ε) ] ≥ 1 − δ. Complexity: the # of calls to the SAT solver is now linear in log ρ and polynomial in log(1/δ), |F|, and 1/ε.

Significantly Faster than SDD: [plot: run time (seconds, log scale from 0.25 to 65536) of WeightMC vs. SDD on benchmarks ordered by increasing # of variables: or-50, or-60, s526a32, s526157, s953a32, s1196a157, s1238a74, Squaring1, Squaring7, LoginService2, Sort, Karatsuba, EnqueueSeq, TreeMax, LLReverse.]

Mean Error: 4% (Allowed: 80%). [Plot: weighted counts (log scale) computed by WeightMC on ~25 benchmarks, against the upper and lower bounds SDD×1.8 and SDD/1.8; WeightMC stays well within the allowed tolerance.]

Outline: Reduction to SAT; Weighted Model Counting; Looking forward.

Distribution-Aware Sampling. Given: CNF formula F, solution space R_F, weight function W(·) over assignments. Problem (Sampling): generate solutions so that Pr(solution y is generated) = W(y)/W(R_F). Example: F = (a ∨ b); R_F = {(0,1), (1,0), (1,1)}; W((0,1)) = W((1,0)) = 1/3; W((1,1)) = W((0,0)) = 1/6; Pr((0,1) is generated) = (1/3)/(5/6) = 2/5.

Partitioning into equal (weighted) small cells: pick a random cell, then pick a solution from that cell according to its weight.
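A sketch of this two-step sampler, with brute-force enumeration standing in for the SAT-solver calls; `weight` is assumed to map an assignment to its literal-weight product:

```python
import random
from itertools import product

def sample_from_cell(formula, variables, weight, m, rng=random):
    """Pick a random cell via m random XORs, enumerate its solutions, and
    draw one with probability proportional to its weight."""
    n = len(variables)
    xors = [([x for x in variables if rng.random() < 0.5], rng.randrange(2))
            for _ in range(m)]
    cell = []
    for values in product([0, 1], repeat=n):
        a = dict(zip(variables, values))
        if formula(a) and all(sum(a[x] for x in xs) % 2 == b
                              for xs, b in xors):
            cell.append(a)
    if not cell:
        return None  # empty cell; a full implementation retries with fresh XORs
    return rng.choices(cell, weights=[weight(a) for a in cell], k=1)[0]
```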

Sampling Distribution: [plot: frequency of each solution count from 247 to 512 for WtGen vs. an ideal generator (IdealGen). Benchmark: case110.cnf; #vars: 287; #clauses: 1263; total runs: 4×10^6; total solutions: 16384.]

Classification: What kinds of problems have small tilt? How can we predict the tilt?

Tackling Tilt: What kinds of problems have low tilt? How do we handle CNF+PBO+XOR? Current PBO solvers can't handle XOR, and SAT solvers can't handle PBO queries.

Extension to More Expressive Domains (SMT, CSP). We need efficient 3-independent hashing schemes: extending bit-wise XOR to SMT provides guarantees but draws no advantage from SMT progress. We also need solvers that handle F + Hash efficiently: CryptoMiniSat has fueled progress in the SAT domain; can similar solvers be built for other domains?

Conclusion: Inference is key to the Internet of Things (IoT). Current inference methods either do not scale or do not provide any approximation guarantees. We presented a novel scalable approach that provides theoretical guarantees of approximation and is significantly better than state-of-the-art tools. Exciting opportunities ahead!

To sum up.

Collaborators

EXTRA SLIDES

Complexity: the tilt captures the possibility of a large-weight solution hiding among many low-weight ones. Is it possible to remove the tilt from the complexity?

Exploring CNF+XOR: very little is understood as of now. Can we observe a phase transition? Eager vs. lazy approaches for handling XORs? How can we reduce the size of the XORs further?

Outline: Reduction to SAT; Partition-based techniques via (unweighted) model counting; Extension to Weighted Model Counting; Discussion on hashing; Looking forward.

XOR-Based Hashing. 3-universal hashing: partition the 2^n space into 2^m cells. Variables: X_1, X_2, X_3, ..., X_n. Pick every variable with probability ½, XOR them, and equate the result to 0/1 with probability ½: X_1 + X_3 + X_6 + ... + X_{n-1} = 0 (cell ID: 0/1). m XOR equations → 2^m cells. The cell: F ∧ XOR (CNF+XOR).

XOR-Based Hashing. CryptoMiniSat is efficient for CNF+XOR. Average XOR length: n/2. The smaller the XORs, the better the performance. How can we shorten the XOR clauses?

Independent Variables: a set of variables such that assignments to them uniquely determine the assignments to the rest of the variables for the formula to be true. Example: for (a ∨ b) = c, the independent support is {a, b}. In practice, CNF encodings introduce auxiliary variables that outnumber the independent support by 2-3 orders of magnitude. Hash only on the independent variables (huge speedup); see the sketch below.
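A sketch of the resulting tweak to the hash construction: the XORs are drawn over the independent support only, so they stay short even when the CNF encoding introduces many auxiliary variables (the helper mirrors the earlier random-XOR construction):

```python
import random

def random_xors_over_support(independent_support, m, rng=random):
    """Random XOR constraints restricted to the independent support; since
    the support determines the remaining variables, this still partitions
    the solution space, with far shorter XORs."""
    return [([x for x in independent_support if rng.random() < 0.5],
             rng.randrange(2)) for _ in range(m)]
```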