Efficient Haplotype Inference with Boolean Satisfiability

1 Efficient Haplotype Inference with Boolean Satisfiability
Joao Marques-Silva (1) and Ines Lynce (2)
(1) School of Electronics and Computer Science, University of Southampton
(2) INESC-ID/IST, Technical University of Lisbon
22 March 2006

2 The HIPP Problem
- Assume a set G of n strings over the alphabet {0, 1, 2}, each with m characters
  - Character j of string $g_i$ is denoted $g_{ij}$
  - Example: $g_i = 012$
- A string $g_i$ is explained by two strings over the alphabet {0, 1}, $h^a$ and $h^b$, iff:
  - If $g_{ij} = 0$, then $h^a_j = h^b_j = 0$
  - If $g_{ij} = 1$, then $h^a_j = h^b_j = 1$
  - If $g_{ij} = 2$, then $h^a_j \neq h^b_j$
  - Example: $g_i = 012$ is explained by $h^a = 010$ and $h^b = 011$
- Our goal is to compute a minimum-size set H of strings over the alphabet {0, 1} such that every string in G is explained by two strings in H
- The problem is NP-hard (it is also APX-hard)
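The three rules above translate directly into a small predicate. The following is a minimal sketch, assuming genotypes and haplotypes are plain Python strings; the function name `explains` is hypothetical and not taken from the talk.

```python
def explains(genotype: str, h_a: str, h_b: str) -> bool:
    """Return True if haplotypes h_a and h_b together explain the genotype."""
    if not (len(genotype) == len(h_a) == len(h_b)):
        return False
    for g, a, b in zip(genotype, h_a, h_b):
        if g in "01":
            # Homozygous site: both haplotypes must carry the genotype's value.
            if not (a == g and b == g):
                return False
        elif g == "2":
            # Heterozygous site: the two haplotypes must differ.
            if a == b:
                return False
        else:
            raise ValueError(f"unexpected genotype character {g!r}")
    return True


# The slide's example: 012 is explained by 010 and 011.
print(explains("012", "010", "011"))
```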

4 An Example of the HIPP Problem
- Strings to be explained: [strings and their explaining pairs lost in transcription]
- HIPP solution has size 6

5 In This Talk
- Relate the problem of haplotype inference from genotype data to the HIPP problem
- Propose a simple Boolean satisfiability (SAT) model for the HIPP problem
  - Compact model
  - Very efficient solution; by far the most efficient solution to the HIPP problem
  - Modern SAT algorithms are extremely effective on the HIPP problem

6 Outline
- Haplotype Inference
- Haplotype Inference by Pure Parsimony (HIPP)
- ILP Models and Variants
- SAT-Based HIPP Model
- Results
- Conclusions

8 Haplotype Inference
- Single Nucleotide Polymorphisms (SNPs):
  - DNA sequence variation that occurs when a nucleotide (typically A, C, G or T) changes among members of the same species
- Haplotypes:
  - Encode bi-allelic Single Nucleotide Polymorphisms (SNPs)
  - Each site of a haplotype (describing a SNP) can take value 0 (the wild type) or 1 (the mutant)
- Genotypes:
  - Typically available instead of haplotypes
  - Each genotype describes two haplotypes
  - Each site of a genotype can take value 0, 1 or 2
    - If a site is 0 or 1, the site is homozygous, and the two haplotypes must coincide at this site
    - If a site is 2, the site is heterozygous, and the two haplotypes must differ at this site
- Haplotype Inference:
  - Identify a set of haplotypes which explains a set of genotypes

9 Haplotype Inference
Nucleotide alleles at each of the 13 SNP sites, followed by haplotypes h1 to h12:

Alleles  G/A  C/A  G/A  C/G  C/T  T/C  T/C  T/C  G/A  C/G  G/A  C/T  C/A
h1       A    C    G    C    C    T    T    T    A    C    G    C    C
h2       A    C    G    G    C    C    C    C    G    G    G    C    C
h3       G    A    A    C    C    T    T    T    A    C    G    C    C
h4       G    C    A    C    C    T    T    T    A    C    G    C    C
h5       G    C    A    C    C    T    T    T    G    C    G    C    C
h6       G    C    G    C    C    T    T    T    G    C    A    C    A
h7       G    C    G    C    C    T    T    T    G    C    A    T    A
h8       G    C    A    C    C    T    T    T    A    C    A    C    A
h9       A    C    G    C    T    T    T    T    A    C    G    C    C
h10      G    C    G    C    C    T    T    T    G    C    A    C    C
h11      G    C    G    C    C    T    T    T    G    C    G    C    C
h12      A    C    G    G    C    T    T    T    A    C    G    C    C

Genotype g = [value lost in transcription] is explained by haplotypes h7 = [value lost] and h8 = [value lost]

10 Why Haplotype Inference?
- Haplotype map of the human genome (see HapMap project)
- Mapping complex disease genes
- Inferring population histories
- Designing new drugs

11 Haplotype Inference by Pure Parsimony (HIPP)
- The pure parsimony criterion: explain the set of genotypes G with the least number of haplotypes
  - Biological motivation: use the least number of entities required to explain natural phenomena
- Example: explain 2120, 2102, and 1221
  - A possible solution (using 6 haplotypes): [haplotype pairs lost in transcription]
  - A pure parsimony solution (using 4 haplotypes): 2120 is explained by 0100 and 1110, 2102 by 0100 and 1101, and 1221 by 1011 and 1101
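For an instance as small as the one on this slide, the pure parsimony optimum can be checked by exhaustive search. The sketch below is a brute-force illustration only, not the method proposed in the talk; the helper names `explaining_pairs` and `pure_parsimony` are hypothetical.

```python
from itertools import product


def explaining_pairs(genotype):
    """Enumerate all unordered haplotype pairs that explain one genotype."""
    het = [j for j, c in enumerate(genotype) if c == "2"]
    if not het:
        return [(genotype, genotype)]
    pairs = []
    # Fix the first heterozygous site of the first haplotype to 0 so that
    # each unordered pair is generated exactly once.
    for bits in product("01", repeat=len(het) - 1):
        a, b = list(genotype), list(genotype)
        for j, bit in zip(het, ("0",) + bits):
            a[j] = bit
            b[j] = "1" if bit == "0" else "0"
        pairs.append(("".join(a), "".join(b)))
    return pairs


def pure_parsimony(genotypes):
    """Try every combination of one explaining pair per genotype."""
    best = None
    for choice in product(*(explaining_pairs(g) for g in genotypes)):
        used = {h for pair in choice for h in pair}
        if best is None or len(used) < len(best):
            best = used
    return best


solution = pure_parsimony(["2120", "2102", "1221"])
print(len(solution), sorted(solution))  # 4 ['0100', '1011', '1101', '1110']
```

The search space is the product of the pair counts of all genotypes, so this only works for toy inputs; it is exactly this blow-up that motivates the ILP and SAT models discussed next.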

12 The HIPP Problem Revisited
- Assume a set G of n genotypes, each with m sites
- A genotype $g_i$ is explained by two haplotypes $h^a$ and $h^b$ iff:
  - If $g_{ij} = 0$, then $h^a_j = h^b_j = 0$
  - If $g_{ij} = 1$, then $h^a_j = h^b_j = 1$
  - If $g_{ij} = 2$, then $h^a_j \neq h^b_j$
- Our goal is to compute a minimum-size set H of haplotypes such that every genotype in G is explained by two haplotypes in H
- The problem is NP-hard (it is also APX-hard)

14 A Naive ILP Model I
- For each genotype $g_i$, enumerate all pairs of haplotypes that can explain $g_i$
- With each pair of haplotypes associate a Boolean variable $y_{ir}$, denoting whether the pair is selected for explaining $g_i$
- Clearly, for each genotype $g_i$ one pair must be selected: $\sum_r y_{ir} = 1$
- Associate a Boolean variable $x_k$ with each haplotype $h_k$, denoting whether $h_k$ is selected
  - The total number of x variables equals the number of candidate haplotypes
  - Can grow exponentially with the number of genotypes

15 A Naive ILP Model II
- If a pair of haplotypes is selected (i.e. $y_{ir} = 1$), then the associated haplotypes must be selected
  - If the pair of haplotypes for variable $y_{ir}$ includes $h_k$, then the condition becomes $y_{ir} \le x_k$
- The objective is to minimize the number of haplotypes selected: $\min \sum_k x_k$
- Space complexity is exponential in the number of genotypes
  - E.g., [genotype lost in transcription] is explained by [count lost in transcription] pairs of haplotypes
- There are ILP models with polynomial space complexity

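16 An Example
- Genotypes: $g_1 = 212$, $g_2 = 021$
- Candidate pairs and their variables:
  - $y_{11}$: (010, 111), with haplotype variables $x_1$, $x_2$
  - $y_{12}$: (011, 110), with haplotype variables $x_3$, $x_4$
  - $y_{21}$: (001, 011), with haplotype variables $x_5$, $x_3$
- Constraints:
  - $y_{11} + y_{12} = 1$
  - $y_{21} = 1$
  - $y_{11} \le x_1$, $y_{11} \le x_2$
  - $y_{12} \le x_3$, $y_{12} \le x_4$
  - $y_{21} \le x_5$, $y_{21} \le x_3$
- Cost function: $\min \sum_{i=1}^{5} x_i$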
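The example is small enough to solve by enumerating every 0/1 assignment to the x and y variables. The sketch below does that in place of calling an ILP solver; it is only a check of the model as written on the slide, and all Python names in it are hypothetical.

```python
from itertools import product

# Haplotypes x1..x5 and candidate pairs y11, y12, y21 from the slide's example.
haplotypes = {1: "010", 2: "111", 3: "011", 4: "110", 5: "001"}
pairs = {"y11": (1, 2), "y12": (3, 4), "y21": (5, 3)}
genotype_pairs = {1: ["y11", "y12"], 2: ["y21"]}


def feasible(x, y):
    """Check the constraints of the naive ILP model for one 0/1 assignment."""
    # Exactly one candidate pair is selected per genotype.
    for names in genotype_pairs.values():
        if sum(y[name] for name in names) != 1:
            return False
    # A selected pair forces both of its haplotypes to be selected (y <= x).
    for name, (k1, k2) in pairs.items():
        if y[name] > x[k1] or y[name] > x[k2]:
            return False
    return True


best_cost, best_x = None, None
for xs in product((0, 1), repeat=len(haplotypes)):
    x = dict(zip(sorted(haplotypes), xs))
    for ys in product((0, 1), repeat=len(pairs)):
        y = dict(zip(pairs, ys))
        if feasible(x, y) and (best_cost is None or sum(xs) < best_cost):
            best_cost, best_x = sum(xs), x

selected = [haplotypes[k] for k in sorted(best_x) if best_x[k]]
print(best_cost, selected)  # 3 ['011', '110', '001']
```

The optimum reuses haplotype 011 ($x_3$) for both genotypes, which is why the pair $y_{12}$ is preferred over $y_{11}$.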

17 Hapar: A Branch-and-Bound Solution
- Also based on enumeration of candidate pairs of haplotypes (as in the naive ILP model)
- Uses a greedy approach for computing an upper bound on the number of required haplotypes
  - For each genotype, select the pair of haplotypes each of which explains the largest number of genotypes
  - The computed upper bound is usually close to the optimum
- The bounding procedure uses this upper bound
- A number of pruning techniques eliminate irrelevant pairs of haplotypes
- Our results indicate that (until now) Hapar was the most efficient approach for the HIPP problem

19 SAT-based Haplotype Inference by Pure Parsimony
- Clearly, the number of haplotypes lies between 1 and 2n: $1 \le |H| \le 2n$
  - It is in general possible to find tighter lower and upper bounds on the number of haplotypes
- The SAT model iteratively considers increasing numbers of candidate haplotypes
  - Starts from the lower bound lb
    - A trivial value is 1
  - Terminates for a value of k when all genotypes can be explained by k haplotypes
  - Guaranteed to terminate by k = 2n
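The outer loop described here can be sketched independently of the SAT encoding. In the sketch below the decision procedure is passed in as a callable; `explainable_with` is a hypothetical stand-in for "generate the model for r haplotypes and run a SAT solver", not an interface from the talk.

```python
from typing import Callable, Optional, Sequence


def smallest_explaining_size(
    genotypes: Sequence[str],
    explainable_with: Callable[[Sequence[str], int], bool],
    lower_bound: int = 1,
) -> Optional[int]:
    """Return the smallest r in [lower_bound, 2n] accepted by the oracle."""
    n = len(genotypes)
    if n == 0:
        return 0
    # Any genotype set can be explained with 2n haplotypes (two per genotype),
    # so a correct oracle accepts at r = 2n at the latest.
    for r in range(max(1, lower_bound), 2 * n + 1):
        if explainable_with(genotypes, r):
            return r
    return None  # only reached if the oracle never accepts


# Usage with a stub oracle that pretends the optimum is 4.
print(smallest_explaining_size(["2120", "2102", "1221"], lambda gs, r: r >= 4))
```

Starting from a tighter lower bound skips the unsatisfiable calls below it, which is where the lower-bound optimizations later in the talk pay off.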

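20 Model for r Candidate Haplotypes I
[Figure: candidate haplotypes $h_1, \dots, h_r$ with sites $h_{k1}, \dots, h_{km}$; selector variables $s^a_{1i}, \dots, s^a_{ri}$ and $s^b_{1i}, \dots, s^b_{ri}$ pick the two haplotypes that explain genotype $g_i$ with sites $g_{i1}, \dots, g_{im}$]
- Must select two haplotypes for explaining each genotype $g_i$
  - g variables are only needed for heterozygous sites
- Candidate haplotypes are represented with $r \cdot m$ Boolean variables (h)
- Selector variables are represented with $2 \cdot r \cdot n$ Boolean variables (s)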

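21 Model for r Candidate Haplotypes II
- Conditions on sites (with $1 \le k \le r$):
  - If $g_{ij} = 0$, add clauses $(\neg h_{kj} \vee \neg s^a_{ki}) \wedge (\neg h_{kj} \vee \neg s^b_{ki})$
    - If $h_k$ is used for explaining $g_i$, then $h_{kj} = g_{ij} = 0$
  - If $g_{ij} = 1$, add clauses $(h_{kj} \vee \neg s^a_{ki}) \wedge (h_{kj} \vee \neg s^b_{ki})$
    - If $h_k$ is used for explaining $g_i$, then $h_{kj} = g_{ij} = 1$
  - If $g_{ij} = 2$:
    - Add variables $g^a_{ij}$ and $g^b_{ij}$, and require $g^a_{ij} \neq g^b_{ij}$: $(g^a_{ij} \vee g^b_{ij}) \wedge (\neg g^a_{ij} \vee \neg g^b_{ij})$
    - Add clauses relating h and g variables: $(h_{kj} \vee \neg g^a_{ij} \vee \neg s^a_{ki}) \wedge (\neg h_{kj} \vee g^a_{ij} \vee \neg s^a_{ki}) \wedge (h_{kj} \vee \neg g^b_{ij} \vee \neg s^b_{ki}) \wedge (\neg h_{kj} \vee g^b_{ij} \vee \neg s^b_{ki})$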

22 Model for r Candidate Haplotypes III
- Conditions on selector variables:
  - Exactly one haplotype (for a and for b) is selected for each genotype $g_i$: $\left(\sum_{k=1}^{r} s^a_{ki} = 1\right) \wedge \left(\sum_{k=1}^{r} s^b_{ki} = 1\right)$
- Space complexity of the model:
  - Worst case ($r = \Theta(n)$): $O(n^2 m)$
  - In practice ($r = O(n)$): $O(n\,r\,m)$
  - In practice the SAT model is significantly more compact than the ILP models
- Performance of the model:
  - Not competitive with the best previous solution, Hapar
  - Need to develop optimizations to the basic model
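The basic model for a fixed r can be sketched as a CNF generator plus a solver call. Two things in the sketch below are assumptions rather than details from the talk: the exactly-one selector constraints use a pairwise at-most-one encoding (the slides do not say which encoding SHIPs uses), and a tiny DPLL procedure stands in for a real SAT solver. The names `encode`, `dpll` and `solve` are hypothetical.

```python
from itertools import combinations


def encode(genotypes, r):
    """CNF clauses (lists of signed ints) for r candidate haplotypes."""
    n, m = len(genotypes), len(genotypes[0])

    def h(k, j):          # site j of candidate haplotype k
        return 1 + k * m + j

    def s(x, k, i):       # x = 0: s^a_ki, x = 1: s^b_ki
        return 1 + r * m + (x * r + k) * n + i

    next_var = r * m + 2 * r * n  # highest variable index used so far
    clauses = []
    for i, g in enumerate(genotypes):
        for j, c in enumerate(g):
            if c == "2":  # fresh g^a_ij, g^b_ij with g^a_ij != g^b_ij
                ga, gb = next_var + 1, next_var + 2
                next_var += 2
                clauses += [[ga, gb], [-ga, -gb]]
            for k in range(r):
                for x in (0, 1):
                    sel = s(x, k, i)
                    if c == "0":
                        clauses.append([-h(k, j), -sel])
                    elif c == "1":
                        clauses.append([h(k, j), -sel])
                    else:
                        gx = ga if x == 0 else gb
                        clauses.append([h(k, j), -gx, -sel])
                        clauses.append([-h(k, j), gx, -sel])
        for x in (0, 1):  # exactly one selector (pairwise encoding, assumed)
            sels = [s(x, k, i) for k in range(r)]
            clauses.append(sels)
            clauses += [[-p, -q] for p, q in combinations(sels, 2)]
    return clauses, h


def dpll(clauses, assignment=None):
    """Minimal DPLL with unit propagation; returns a model dict or None."""
    assignment = dict(assignment or {})
    while True:
        simplified, unit = [], None
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    lits.append(lit)
                elif val == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not lits:
                return None  # conflict: clause falsified
            if len(lits) == 1 and unit is None:
                unit = lits[0]
            simplified.append(lits)
        clauses = simplified
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
    if not clauses:
        return assignment
    lit = clauses[0][0]
    for value in (lit > 0, lit < 0):
        result = dpll(clauses, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


def solve(genotypes, r):
    """Return r haplotypes explaining all genotypes, or None if impossible."""
    clauses, h = encode(genotypes, r)
    model = dpll(clauses)
    if model is None:
        return None
    m = len(genotypes[0])
    return ["".join("1" if model.get(h(k, j), False) else "0" for j in range(m))
            for k in range(r)]


print(solve(["212", "021"], 2))          # None: two haplotypes are not enough
print(sorted(solve(["212", "021"], 3)))  # ['001', '011', '110']
```

On the two genotypes of the earlier ILP example the model is unsatisfiable for r = 2 and satisfiable for r = 3, matching the ILP optimum.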

23 Optimizations I: Structural Simplifications
- There can exist duplicate genotypes
  - If two genotypes are identical, they can be explained by the same pair of haplotypes
  - Eliminate duplicate genotypes; reconstruct eliminated genotypes from computed haplotypes
- There can exist globally duplicated and globally complemented sites
  [Figure: genotype matrix highlighting a duplicated column and a complemented column]
  - Eliminate duplicate/complemented columns (sites); reconstruct eliminated columns from computed haplotypes
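Both simplifications can be sketched in a few lines. The sketch assumes that a "complemented" column is one obtained from another by swapping 0 and 1 while leaving 2 unchanged, which is an interpretation of the slide rather than a quoted definition; `simplify` and the shape of its return value are hypothetical.

```python
def simplify(genotypes):
    """Drop duplicate genotypes, then duplicated or complemented columns.

    Returns (reduced genotypes, kept column indices, recovery map), where the
    recovery map sends each removed column to (source column, is_complement).
    """
    unique = list(dict.fromkeys(genotypes))  # keeps first occurrences, in order
    if not unique:
        return [], [], {}
    m = len(unique[0])
    columns = ["".join(g[j] for g in unique) for j in range(m)]
    flip = str.maketrans("01", "10")  # complement swaps 0 and 1, keeps 2
    first_seen, kept, recover = {}, [], {}
    for j, col in enumerate(columns):
        if col in first_seen:
            recover[j] = (first_seen[col], False)
        elif col.translate(flip) in first_seen:
            recover[j] = (first_seen[col.translate(flip)], True)
        else:
            first_seen[col] = j
            kept.append(j)
    reduced = ["".join(g[j] for j in kept) for g in unique]
    return reduced, kept, recover


print(simplify(["0120", "1021", "0120"]))
```

After solving the reduced instance, a removed column is restored in every haplotype by copying its source column, negated if it was a complement. Removing columns can make further genotypes identical, so a fuller implementation might repeat both steps until nothing changes.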

24 Optimizations II: Lower Bounds I
- Two genotypes $g_a$ and $g_b$ are incompatible if at a given site j one has value 0 and the other has value 1
  - $g_a = 012$ is incompatible with $g_b = 102$
- Can compute a clique of mutually incompatible genotypes
  - E.g., $g_a = 012$, $g_b = 102$ and $g_c = 110$ are mutually incompatible, and form a clique of size 3

25 Optimizations II: Lower Bounds II
- Use the clique for computing a lower bound on the number of required haplotypes
  - If a genotype in the clique has no heterozygous sites, then its contribution to the lower bound is 1
  - Otherwise, each genotype in the clique contributes 2 to the lower bound
- E.g., for $g_a = 012$ (contributes 2), $g_b = 102$ (contributes 2) and $g_c = 110$ (contributes 1), the computed lower bound is 5
- Can further improve the lower bound (see paper(s))
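The bound works because incompatible genotypes cannot share a haplotype, so the contributions of clique members simply add up. The sketch below builds the clique greedily in input order, which is an assumption made for illustration; the slides do not say how the clique is computed. The function names are hypothetical.

```python
def incompatible(g1, g2):
    """True if some site is 0 in one genotype and 1 in the other."""
    return any({a, b} == {"0", "1"} for a, b in zip(g1, g2))


def clique_lower_bound(genotypes):
    """Greedy clique of mutually incompatible genotypes and its lower bound."""
    clique = []
    for g in genotypes:
        if all(incompatible(g, other) for other in clique):
            clique.append(g)
    # A genotype with a heterozygous site needs two distinct haplotypes,
    # a fully homozygous one needs a single haplotype.
    bound = sum(2 if "2" in g else 1 for g in clique)
    return bound, clique


print(clique_lower_bound(["012", "102", "110"]))  # (5, ['012', '102', '110'])
```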

26 Optimizations III: Breaking Symmetries
- The problem formulation has key symmetries. Consider:
  - Haplotypes $h_{k_1}$ and $h_{k_2}$
  - Selector variables $s^a_{k_1 i}$, $s^a_{k_2 i}$, $s^b_{k_1 i}$ and $s^b_{k_2 i}$
  - Boolean valuations $v_x$ and $v_y$ to the sites of haplotypes $h_{k_1}$ and $h_{k_2}$, e.g. $h^{v_x}_{k_1}$ or $h^{v_y}_{k_2}$
- $h^{v_x}_{k_1}$ and $h^{v_y}_{k_2}$, with $s^a_{k_1 i}\, s^a_{k_2 i}\, s^b_{k_1 i}\, s^b_{k_2 i} = 1001$, corresponds to $h^{v_y}_{k_1}$ and $h^{v_x}_{k_2}$, with $s^a_{k_1 i}\, s^a_{k_2 i}\, s^b_{k_1 i}\, s^b_{k_2 i} = 0110$
  - $h_{k_1} = 0100$ and $h_{k_2} = 0101$ with $s^a_{k_1 i}\, s^a_{k_2 i}\, s^b_{k_1 i}\, s^b_{k_2 i} = 1001$ explains genotype $g_i = 0102$
  - $h_{k_1} = 0101$ and $h_{k_2} = 0100$ with $s^a_{k_1 i}\, s^a_{k_2 i}\, s^b_{k_1 i}\, s^b_{k_2 i} = 0110$ also explains genotype $g_i = 0102$
- Symmetries can be eliminated:
  - Enforce an ordering of the Boolean valuations to the haplotypes
  - Require $h^v_1 < h^v_2 < \dots < h^v_r$ for any valuation v
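The effect of the ordering requirement can be seen by putting solutions into sorted form after the fact. The sketch below does not add ordering clauses to the CNF as the talk proposes; it only illustrates that the two symmetric solutions from the slide collapse to a single one once haplotypes are ordered. The function `canonical` and its data layout are hypothetical.

```python
def canonical(haplotypes, selections):
    """Sort haplotypes and remap each genotype's (a, b) selector indices.

    `selections[i]` is the pair of haplotype indices chosen by the a and b
    selectors for genotype i.
    """
    order = sorted(range(len(haplotypes)), key=lambda k: haplotypes[k])
    new_index = {old: new for new, old in enumerate(order)}
    sorted_haps = [haplotypes[k] for k in order]
    remapped = [tuple(new_index[k] for k in pair) for pair in selections]
    return sorted_haps, remapped


# The slide's two symmetric explanations of g_i = 0102.
first = canonical(["0100", "0101"], [(0, 1)])   # selectors 1001
second = canonical(["0101", "0100"], [(1, 0)])  # selectors 0110
print(first == second, first)  # True (['0100', '0101'], [(0, 1)])
```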

27 Organization of SHIPs
- SHIPs is implemented as a Perl script that accepts a list of genotypes to explain
- The script iteratively generates CNF models for increasingly larger numbers of haplotypes
- Each CNF formula is passed on to a SAT solver
  - Can interface with minisat, siege, satz
- The algorithm terminates by returning the smallest number of haplotypes which can explain all genotypes
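SAT solvers such as those listed conventionally read CNF formulas in the DIMACS text format. The sketch below serializes a clause list into that format; it is written in Python for illustration (SHIPs itself is a Perl script), and the function name `to_dimacs` is hypothetical.

```python
def to_dimacs(clauses, num_vars=None):
    """Serialize clauses (lists of signed ints) as DIMACS CNF text."""
    if num_vars is None:
        num_vars = max((abs(lit) for clause in clauses for lit in clause), default=0)
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    # Each clause is a space-separated list of literals terminated by 0.
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"


print(to_dimacs([[1, -2], [2]]), end="")
```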

29 SHIPs vs. Others on Standard Instances
Instances solved within 7200 seconds of CPU time ("?" marks a value lost in transcription):

Instance  #S/Rec  #G    RTIP   Poly   Hyb    Hapar  SHIPs
Unif      10/?    ?     ?/15   15/15  15/15  15/15  15/15
          10/?    ?     ?/15   15/15  15/15  15/15  15/15
          10/?    ?     ?/15   12/15  15/15  15/15  15/15
          ?       ?     ?/15   11/15  15/15  15/15  15/15
          ?       ?     ?/50   27/50  35/50  50/50  50/50
          ?       ?     ?/10   4/10   6/10   9/10   10/10
          ?       ?     ?/10   3/10   3/10   9/10   10/10
NonUnif   ?       ?     ?/15   14/15  15/15  15/15  15/15
          ?       ?     ?/15   8/15   15/15  15/15  15/15
          ?       ?     ?/15   0/15   6/15   14/15  15/15
          ?       ?     ?/15   0/15   5/15   6/15   15/15
          ?       ?     ?/15   0/15   3/15   4/15   15/15
Hapmap    30:75   7:68  0/24   15/24  15/24  17/24  23/24
TOTAL                   92/?   ?      ?      ?      ?/229

30 SHIPs vs. Hapar I
[Figure: plot comparing hapar and basic ships]

31 SHIPs vs. Hapar II
[Figure: run time over instances, for hapar and basic ships]

33 Conclusions & Ongoing Research Work
- SAT-Based Haplotype Inference
  - By far the most efficient solution to the HIPP problem
  - Up to 5 orders of magnitude speedup with respect to any other solution to the HIPP problem
- How precise is pure parsimony?
  - SHIPs allows extensive analysis on real-world data
  - Accuracy of HIPP is similar to the best statistical methods
- Optimizing the model
  - Reducing the number of variables
  - Additional sorting constraints, on the s variables
  - Further uses of incompatibilities
  - Promising preliminary results (see next slide!)
- Can use SAT to evaluate alternative criteria for the haplotype inference problem

34 The Near Future: SHIPs Improved Model I
[Figure: std instances; plot comparing basic ships and improved ships]

35 The Near Future: SHIPs Improved Model II
[Figure: bio test data; plot comparing basic ships and improved ships]

SAT in Bioinformatics: Making the Case with Haplotype Inference

SAT in Bioinformatics: Making the Case with Haplotype Inference SAT in Bioinformatics: Making the Case with Haplotype Inference Inês Lynce 1 and João Marques-Silva 2 1 IST/INESC-ID, Technical University of Lisbon, Portugal ines@sat.inesc-id.pt 2 School of Electronics

More information

Efficient Haplotype Inference with Answer Set Programming

Efficient Haplotype Inference with Answer Set Programming Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008) Efficient Haplotype Inference with Answer Set Programming Esra Erdem and Ferhan Türe Faculty of Engineering and Natural

More information

On Computing Backbones of Propositional Theories

On Computing Backbones of Propositional Theories On Computing Backbones of Propositional Theories Joao Marques-Silva 1 Mikoláš Janota 2 Inês Lynce 3 1 CASL/CSI, University College Dublin, Ireland 2 INESC-ID, Lisbon, Portugal 3 INESC-ID/IST, Lisbon, Portugal

More information

Improving Unsatisfiability-based Algorithms for Boolean Optimization

Improving Unsatisfiability-based Algorithms for Boolean Optimization Improving Unsatisfiability-based Algorithms for Boolean Optimization Vasco Manquinho Ruben Martins Inês Lynce IST/INESC-ID, Technical University of Lisbon, Portugal SAT 2010, Edinburgh 1 / 27 Motivation

More information

Haplotyping estimation from aligned single nucleotide polymorphism fragments has attracted increasing

Haplotyping estimation from aligned single nucleotide polymorphism fragments has attracted increasing INFORMS Journal on Computing Vol. 22, No. 2, Spring 2010, pp. 195 209 issn 1091-9856 eissn 1526-5528 10 2202 0195 informs doi 10.1287/ijoc.1090.0333 2010 INFORMS A Class Representative Model for Pure Parsimony

More information

Admin NP-COMPLETE PROBLEMS. Run-time analysis. Tractable vs. intractable problems 5/2/13. What is a tractable problem?

Admin NP-COMPLETE PROBLEMS. Run-time analysis. Tractable vs. intractable problems 5/2/13. What is a tractable problem? Admin Two more assignments No office hours on tomorrow NP-COMPLETE PROBLEMS Run-time analysis Tractable vs. intractable problems We ve spent a lot of time in this class putting algorithms into specific

More information

A Class Representative Model for Pure Parsimony Haplotyping

A Class Representative Model for Pure Parsimony Haplotyping A Class Representative Model for Pure Parsimony Haplotyping Daniele Catanzaro, Alessandra Godi, and Martine Labbé June 5, 2008 Abstract Haplotyping estimation from aligned Single Nucleotide Polymorphism

More information

A Pigeon-Hole Based Encoding of Cardinality Constraints 1

A Pigeon-Hole Based Encoding of Cardinality Constraints 1 A Pigeon-Hole Based Encoding of Cardinality Constraints 1 Saïd Jabbour Lakhdar Saïs Yakoub Salhi January 7, 2014 CRIL - CNRS, Université Lille Nord de France, Lens, France 1 Funded by ANR DEFIS 2009 Program,

More information

Comp487/587 - Boolean Formulas

Comp487/587 - Boolean Formulas Comp487/587 - Boolean Formulas 1 Logic and SAT 1.1 What is a Boolean Formula Logic is a way through which we can analyze and reason about simple or complicated events. In particular, we are interested

More information

The pure parsimony haplotyping problem: overview and computational advances

The pure parsimony haplotyping problem: overview and computational advances Intl. Trans. in Op. Res. 16 (2009) 561 584 DOI: 10.1111/j.1475-3995.2009.00716.x INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH The pure parsimony haplotyping problem: overview and computational advances

More information

NP-Completeness. A language B is NP-complete iff B NP. This property means B is NP hard

NP-Completeness. A language B is NP-complete iff B NP. This property means B is NP hard NP-Completeness A language B is NP-complete iff B NP A NP A P B This property means B is NP hard 1 3SAT is NP-complete 2 Result Idea: B is known to be NP complete Use it to prove NP-Completeness of C IF

More information

Complexity Theory. Jörg Kreiker. Summer term Chair for Theoretical Computer Science Prof. Esparza TU München

Complexity Theory. Jörg Kreiker. Summer term Chair for Theoretical Computer Science Prof. Esparza TU München Complexity Theory Jörg Kreiker Chair for Theoretical Computer Science Prof. Esparza TU München Summer term 2010 Lecture 6 conp Agenda conp the importance of P vs. NP vs. conp neither in P nor NP-complete:

More information

Decision Procedures An Algorithmic Point of View

Decision Procedures An Algorithmic Point of View An Algorithmic Point of View ILP References: Integer Programming / Laurence Wolsey Deciding ILPs with Branch & Bound Intro. To mathematical programming / Hillier, Lieberman Daniel Kroening and Ofer Strichman

More information

NP-COMPLETE PROBLEMS. 1. Characterizing NP. Proof

NP-COMPLETE PROBLEMS. 1. Characterizing NP. Proof T-79.5103 / Autumn 2006 NP-complete problems 1 NP-COMPLETE PROBLEMS Characterizing NP Variants of satisfiability Graph-theoretic problems Coloring problems Sets and numbers Pseudopolynomial algorithms

More information

Integer Programming in Computational Biology. D. Gusfield University of California, Davis Presented December 12, 2016.!

Integer Programming in Computational Biology. D. Gusfield University of California, Davis Presented December 12, 2016.! Integer Programming in Computational Biology D. Gusfield University of California, Davis Presented December 12, 2016. There are many important phylogeny problems that depart from simple tree models: Missing

More information

From SAT To SMT: Part 1. Vijay Ganesh MIT

From SAT To SMT: Part 1. Vijay Ganesh MIT From SAT To SMT: Part 1 Vijay Ganesh MIT Software Engineering & SMT Solvers An Indispensable Tactic for Any Strategy Formal Methods Program Analysis SE Goal: Reliable/Secure Software Automatic Testing

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 8. Satisfiability and Model Construction Davis-Putnam-Logemann-Loveland Procedure, Phase Transitions, GSAT Joschka Boedecker and Wolfram Burgard and Bernhard Nebel

More information

NP-Complete Reductions 2

NP-Complete Reductions 2 x 1 x 1 x 2 x 2 x 3 x 3 x 4 x 4 12 22 32 CS 447 11 13 21 23 31 33 Algorithms NP-Complete Reductions 2 Prof. Gregory Provan Department of Computer Science University College Cork 1 Lecture Outline NP-Complete

More information

Allen Holder - Trinity University

Allen Holder - Trinity University Haplotyping - Trinity University Population Problems - joint with Courtney Davis, University of Utah Single Individuals - joint with John Louie, Carrol College, and Lena Sherbakov, Williams University

More information

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University Algorithms NP -Complete Problems Dong Kyue Kim Hanyang University dqkim@hanyang.ac.kr The Class P Definition 13.2 Polynomially bounded An algorithm is said to be polynomially bounded if its worst-case

More information

Efficient Approximation for Restricted Biclique Cover Problems

Efficient Approximation for Restricted Biclique Cover Problems algorithms Article Efficient Approximation for Restricted Biclique Cover Problems Alessandro Epasto 1, *, and Eli Upfal 2 ID 1 Google Research, New York, NY 10011, USA 2 Department of Computer Science,

More information

Complexity (Pre Lecture)

Complexity (Pre Lecture) Complexity (Pre Lecture) Dr. Neil T. Dantam CSCI-561, Colorado School of Mines Fall 2018 Dantam (Mines CSCI-561) Complexity (Pre Lecture) Fall 2018 1 / 70 Why? What can we always compute efficiently? What

More information

Reconstructing Chemical Reaction Networks by Solving Boolean Polynomial Systems

Reconstructing Chemical Reaction Networks by Solving Boolean Polynomial Systems Reconstructing Chemical Reaction Networks by Solving Boolean Polynomial Systems Chenqi Mou Wei Niu LMIB-School of Mathematics École Centrale Pékin and Systems Science Beihang University, Beijing 100191,

More information

NP-Completeness. Andreas Klappenecker. [based on slides by Prof. Welch]

NP-Completeness. Andreas Klappenecker. [based on slides by Prof. Welch] NP-Completeness Andreas Klappenecker [based on slides by Prof. Welch] 1 Prelude: Informal Discussion (Incidentally, we will never get very formal in this course) 2 Polynomial Time Algorithms Most of the

More information

1. Introduction Recap

1. Introduction Recap 1. Introduction Recap 1. Tractable and intractable problems polynomial-boundness: O(n k ) 2. NP-complete problems informal definition 3. Examples of P vs. NP difference may appear only slightly 4. Optimization

More information

Algorithms for Satisfiability beyond Resolution.

Algorithms for Satisfiability beyond Resolution. Algorithms for Satisfiability beyond Resolution. Maria Luisa Bonet UPC, Barcelona, Spain Oaxaca, August, 2018 Co-Authors: Sam Buss, Alexey Ignatiev, Joao Marques-Silva, Antonio Morgado. Motivation. Satisfiability

More information

Essential facts about NP-completeness:

Essential facts about NP-completeness: CMPSCI611: NP Completeness Lecture 17 Essential facts about NP-completeness: Any NP-complete problem can be solved by a simple, but exponentially slow algorithm. We don t have polynomial-time solutions

More information

Lecture 25: Cook s Theorem (1997) Steven Skiena. skiena

Lecture 25: Cook s Theorem (1997) Steven Skiena.   skiena Lecture 25: Cook s Theorem (1997) Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena Prove that Hamiltonian Path is NP

More information

CSE 135: Introduction to Theory of Computation NP-completeness

CSE 135: Introduction to Theory of Computation NP-completeness CSE 135: Introduction to Theory of Computation NP-completeness Sungjin Im University of California, Merced 04-15-2014 Significance of the question if P? NP Perhaps you have heard of (some of) the following

More information

A brief introduction to Logic. (slides from

A brief introduction to Logic. (slides from A brief introduction to Logic (slides from http://www.decision-procedures.org/) 1 A Brief Introduction to Logic - Outline Propositional Logic :Syntax Propositional Logic :Semantics Satisfiability and validity

More information

Decision Procedures for Satisfiability and Validity in Propositional Logic

Decision Procedures for Satisfiability and Validity in Propositional Logic Decision Procedures for Satisfiability and Validity in Propositional Logic Meghdad Ghari Institute for Research in Fundamental Sciences (IPM) School of Mathematics-Isfahan Branch Logic Group http://math.ipm.ac.ir/isfahan/logic-group.htm

More information

On Unit-Refutation Complete Formulae with Existentially Quantified Variables

On Unit-Refutation Complete Formulae with Existentially Quantified Variables On Unit-Refutation Complete Formulae with Existentially Quantified Variables Lucas Bordeaux 1 Mikoláš Janota 2 Joao Marques-Silva 3 Pierre Marquis 4 1 Microsoft Research, Cambridge 2 INESC-ID, Lisboa 3

More information

CS 6505, Complexity and Algorithms Week 7: NP Completeness

CS 6505, Complexity and Algorithms Week 7: NP Completeness CS 6505, Complexity and Algorithms Week 7: NP Completeness Reductions We have seen some problems in P and NP, and we ve talked about space complexity. The Space Hierarchy Theorem showed us that there are

More information

CS21 Decidability and Tractability

CS21 Decidability and Tractability CS21 Decidability and Tractability Lecture 18 February 16, 2018 February 16, 2018 CS21 Lecture 18 1 Outline the complexity class NP 3-SAT is NP-complete NP-complete problems: independent set, vertex cover,

More information

Evolutionary Tree Analysis. Overview

Evolutionary Tree Analysis. Overview CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

More information

Introduction Algorithms Applications MINISAT. Niklas Sörensson Chalmers University of Technology and Göteborg University

Introduction Algorithms Applications MINISAT. Niklas Sörensson Chalmers University of Technology and Göteborg University SAT ALGORITHMS AND APPLICATIONS nik@cschalmersse Chalmers University of Technology and Göteborg University Empirically Successful Classical Automated Reasoning a CADE-20 Workshop 22nd - 23th July, 2005

More information

CSE 555 HW 5 SAMPLE SOLUTION. Question 1.

CSE 555 HW 5 SAMPLE SOLUTION. Question 1. CSE 555 HW 5 SAMPLE SOLUTION Question 1. Show that if L is PSPACE-complete, then L is NP-hard. Show that the converse is not true. If L is PSPACE-complete, then for all A PSPACE, A P L. We know SAT PSPACE

More information

Correctness of Dijkstra s algorithm

Correctness of Dijkstra s algorithm Correctness of Dijkstra s algorithm Invariant: When vertex u is deleted from the priority queue, d[u] is the correct length of the shortest path from the source s to vertex u. Additionally, the value d[u]

More information

CSE 3500 Algorithms and Complexity Fall 2016 Lecture 25: November 29, 2016

CSE 3500 Algorithms and Complexity Fall 2016 Lecture 25: November 29, 2016 CSE 3500 Algorithms and Complexity Fall 2016 Lecture 25: November 29, 2016 Intractable Problems There are many problems for which the best known algorithms take a very long time (e.g., exponential in some

More information

Lecture #14: NP-Completeness (Chapter 34 Old Edition Chapter 36) Discussion here is from the old edition.

Lecture #14: NP-Completeness (Chapter 34 Old Edition Chapter 36) Discussion here is from the old edition. Lecture #14: 0.0.1 NP-Completeness (Chapter 34 Old Edition Chapter 36) Discussion here is from the old edition. 0.0.2 Preliminaries: Definition 1 n abstract problem Q is a binary relations on a set I of

More information

Improvements to Core-Guided Binary Search for MaxSAT

Improvements to Core-Guided Binary Search for MaxSAT Improvements to Core-Guided Binary Search for MaxSAT A.Morgado F.Heras J.Marques-Silva CASL/CSI, University College Dublin Dublin, Ireland SAT, June 2012 Outline Motivation Previous Algorithms - Overview

More information

CS Lecture 29 P, NP, and NP-Completeness. k ) for all k. Fall The class P. The class NP

CS Lecture 29 P, NP, and NP-Completeness. k ) for all k. Fall The class P. The class NP CS 301 - Lecture 29 P, NP, and NP-Completeness Fall 2008 Review Languages and Grammars Alphabets, strings, languages Regular Languages Deterministic Finite and Nondeterministic Automata Equivalence of

More information

Solvers for the Problem of Boolean Satisfiability (SAT) Will Klieber Aug 31, 2011

Solvers for the Problem of Boolean Satisfiability (SAT) Will Klieber Aug 31, 2011 Solvers for the Problem of Boolean Satisfiability (SAT) Will Klieber 15-414 Aug 31, 2011 Why study SAT solvers? Many problems reduce to SAT. Formal verification CAD, VLSI Optimization AI, planning, automated

More information

More on NP and Reductions

More on NP and Reductions Indian Institute of Information Technology Design and Manufacturing, Kancheepuram Chennai 600 127, India An Autonomous Institute under MHRD, Govt of India http://www.iiitdm.ac.in COM 501 Advanced Data

More information

P is the class of problems for which there are algorithms that solve the problem in time O(n k ) for some constant k.

P is the class of problems for which there are algorithms that solve the problem in time O(n k ) for some constant k. Complexity Theory Problems are divided into complexity classes. Informally: So far in this course, almost all algorithms had polynomial running time, i.e., on inputs of size n, worst-case running time

More information

LOGIC PROPOSITIONAL REASONING

LOGIC PROPOSITIONAL REASONING LOGIC PROPOSITIONAL REASONING WS 2017/2018 (342.208) Armin Biere Martina Seidl biere@jku.at martina.seidl@jku.at Institute for Formal Models and Verification Johannes Kepler Universität Linz Version 2018.1

More information

A Little Logic. Propositional Logic. Satisfiability Problems. Solving Sudokus. First Order Logic. Logic Programming

A Little Logic. Propositional Logic. Satisfiability Problems. Solving Sudokus. First Order Logic. Logic Programming A Little Logic International Center for Computational Logic Technische Universität Dresden Germany Propositional Logic Satisfiability Problems Solving Sudokus First Order Logic Logic Programming A Little

More information

Propositional Logic: Evaluating the Formulas

Propositional Logic: Evaluating the Formulas Institute for Formal Models and Verification Johannes Kepler University Linz VL Logik (LVA-Nr. 342208) Winter Semester 2015/2016 Propositional Logic: Evaluating the Formulas Version 2015.2 Armin Biere

More information

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase Humans have two copies of each chromosome Inherited from mother and father. Genotyping technologies do not maintain the phase Genotyping technologies do not maintain the phase Recall that proximal SNPs

More information

Undecidable Problems. Z. Sawa (TU Ostrava) Introd. to Theoretical Computer Science May 12, / 65

Undecidable Problems. Z. Sawa (TU Ostrava) Introd. to Theoretical Computer Science May 12, / 65 Undecidable Problems Z. Sawa (TU Ostrava) Introd. to Theoretical Computer Science May 12, 2018 1/ 65 Algorithmically Solvable Problems Let us assume we have a problem P. If there is an algorithm solving

More information

Introduction to Solving Combinatorial Problems with SAT

Introduction to Solving Combinatorial Problems with SAT Introduction to Solving Combinatorial Problems with SAT Javier Larrosa December 19, 2014 Overview of the session Review of Propositional Logic The Conjunctive Normal Form (CNF) Modeling and solving combinatorial

More information

Constraint-based Subspace Clustering
