Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space


1 Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space Pablo Gainza CPS 296: Topics in Computational Structural Biology Department of Computer Science Duke University 1

2 Outline 1) Problem definition 2) Formulation as an inference problem 3) Graphical models 4) The tbmmf algorithm 5) Results 6) Conclusions

3 1. Problem Definition Inputs: rotamer library, energy function, positions to design and their allowed rotamers/amino acids, protein structure. The protein design algorithm outputs the GMEC and its sequence S.

4 1. Problem Definition (2) r : a rotamer assignment (RA). r_i : the rotamer at position i in RA r.

5 1. Problem Definition (3) E_i(r_i) : energy between rotamer r_i and the fixed backbone. E_ij(r_i, r_j) : energy between rotamers r_i and r_j. E(r) : energy of rotamer assignment r. E(r) = Σ_i E_i(r_i) + Σ_{i,j} E_ij(r_i, r_j)
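As a concrete sketch, the total energy of an assignment can be tallied directly from singleton and pairwise tables. The energy values and the two-position setup below are purely illustrative (not from a real rotamer library), and each pairwise term is counted once:

```python
# Sketch of E(r) = sum_i E_i(r_i) + sum over pairs of E_ij(r_i, r_j),
# counting each pairwise term once. All values are made up for illustration.

def assignment_energy(r, e_single, e_pair):
    """Energy of rotamer assignment r, given as a tuple of rotamer indices."""
    total = sum(e_single[i][ri] for i, ri in enumerate(r))
    for (i, j), table in e_pair.items():
        total += table[r[i]][r[j]]
    return total

# Two design positions, two rotamers each (hypothetical energies).
e_single = [[-1.0, -1.0], [-5.0, -3.0]]           # E_i(r_i)
e_pair = {(0, 1): [[-2.0, 0.0], [0.0, -6.0]]}     # E_ij(r_i, r_j)

print(assignment_energy((0, 0), e_single, e_pair))  # -8.0
print(assignment_energy((1, 1), e_single, e_pair))  # -10.0
```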

6 1. Problem Definition (4) T(k) : returns the amino acid type of rotamer k. T(r) : returns the sequence of rotamer assignment r. Example: T(r_1) = hexagon, T(r_2) = cross, so T(r) = (hexagon, cross).

7 1. Problem Definition (5) Inputs: rotamer library, energy function, positions to design and their allowed rotamers/amino acids, protein structure. Output of the protein design algorithm: S = T(arg min_r E(r))

8 1. Problem Definition (6) Related Work. Exact methods (find the Global Minimum Energy Conformation): DEE/A*, BroMAP, BWM. Probabilistic methods (find a low energy conformation): SCMF, MCSA.

9 1. Problem Definition (7) The inputs (rotamer library, energy function, positions to design and their allowed rotamers/amino acids, protein structure) form a model, and that model is inaccurate!

10 1. Problem Definition (8) The protein design algorithm itself can be fast or provable: S = T(arg min_r E(r))

11 1. Problem Definition (9) A single low energy conformation may be too stable, may not fold to the target structure, or may have low binding specificity.

12 1. Problem Definition (10) Solution: find a set of low energy sequences. Provable methods: DEE/A* yields an ordered, gap-free set of low energy conformations, including the GMEC. Probabilistic methods: tbmmf yields a set of low energy conformations.

13 Problem Definition: Summary Protein design algorithms search for the sequence with the Global Minimum Energy Conformation (GMEC). Our model is inaccurate: more than one low energy sequence is desirable. Fromer et al. propose tbmmf to generate a set of low energy sequences.

14 2. Our problem as an inference problem Probabilistic factor for self-interactions: ψ_i(r_i) = e^(−E_i(r_i)/T). Probabilistic factor for pairwise interactions: ψ_ij(r_i, r_j) = e^(−E_ij(r_i, r_j)/T).

15 2. Inference problem (2) Partition function: Z = Σ_r e^(−E(r)/T). Probability distribution over rotamer assignments: P(r_1, ..., r_N) = (1/Z) Π_i ψ_i(r_i) Π_{i,j} ψ_ij(r_i, r_j) = (1/Z) e^(−E(r)/T).
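A brute-force enumeration makes this correspondence concrete. The sketch below (toy two-position problem, illustrative energies, T = 1) computes Z and checks that the product of factors reproduces e^(−E(r)/T)/Z for every assignment:

```python
import itertools
import math

# Toy two-position problem with illustrative energies, T = 1.
T = 1.0
e_single = [[-1.0, -1.0], [-5.0, -3.0]]
e_pair = {(0, 1): [[-2.0, 0.0], [0.0, -4.0]]}

def energy(r):
    """E(r) = sum of singleton terms plus pairwise terms."""
    e = sum(e_single[i][ri] for i, ri in enumerate(r))
    e += sum(t[r[i]][r[j]] for (i, j), t in e_pair.items())
    return e

assignments = list(itertools.product(range(2), repeat=2))
Z = sum(math.exp(-energy(r) / T) for r in assignments)  # partition function

def prob(r):
    """P(r) from the factorized form (1/Z) * prod psi_i * prod psi_ij."""
    psi = math.prod(math.exp(-e_single[i][ri] / T) for i, ri in enumerate(r))
    psi *= math.prod(math.exp(-t[r[i]][r[j]] / T) for (i, j), t in e_pair.items())
    return psi / Z

# The factorized form equals exp(-E(r)/T)/Z for every assignment,
# and the probabilities sum to 1.
for r in assignments:
    assert abs(prob(r) - math.exp(-energy(r) / T) / Z) < 1e-12
assert abs(sum(prob(r) for r in assignments) - 1.0) < 1e-9
```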

16 2. Inference problem (3) Minimization goal (from the definition): S = T(arg min_r E(r)). Equivalent goal for a graphical model problem: S = T(arg max_r Pr(r)).

17 2. Inference problem (4) Example: two design positions, each allowing two rotamers, r′ and r″. Singleton energies: E_1(r′_1) = −1, E_1(r″_1) = −1, E_2(r′_2) = −5, E_2(r″_2) = −3. Pairwise energies: E_12(r′_1, r′_2) = −2, E_12(r″_1, r″_2) = −4. What are E(r′) and E(r″)? What is our GMEC?

18 2. Inference problem (5) E(r′) = (−1 − 2) + (−5 − 2) = −10. E(r″) = (−1 − 4) + (−3 − 4) = −12 (the slides tally the pairwise energy once from each of the two positions). r″ is our GMEC.

19 2. Inference problem (6) With T = 1 for our example: ψ_1(r′_1) = e^(−E_1(r′_1)/T) = e^1, ψ_2(r′_2) = e^5, ψ_1(r″_1) = e^1, ψ_2(r″_2) = e^3.

20 2. Inference problem (7) With T = 1: ψ_12(r′_1, r′_2) = e^(−E_12(r′_1, r′_2)/T) = e^2, ψ_12(r″_1, r″_2) = e^4. Z = Σ_r e^(−E(r)/T) = e^10 + e^12.

21 2. Inference problem (8) With T = 1: P(r′_1, r′_2) = (1/Z) Π_i ψ_i(r′_i) Π_{i,j} ψ_ij(r′_i, r′_j) = e^10 / (e^10 + e^12).

22 2. Inference problem (9) With T = 1: P(r″_1, r″_2) = (1/Z) Π_i ψ_i(r″_i) Π_{i,j} ψ_ij(r″_i, r″_j) = e^12 / (e^10 + e^12).

23 2. Inference problem (10) With T = 1: S = T(arg max_r Pr(r)) = T(r″).
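The arithmetic of slides 17 through 23 can be checked directly. The values below are reconstructed from the slides, and the energy function mirrors the slides' convention of tallying the pairwise energy once from each of the two positions:

```python
import math

# Singleton and pairwise energies reconstructed from the two-rotamer example.
E1 = {"r'": -1.0, "r''": -1.0}   # position 1
E2 = {"r'": -5.0, "r''": -3.0}   # position 2
E12 = {"r'": -2.0, "r''": -4.0}  # pairwise energy for the matching pair

def E(a):
    # As on the slides: E(r) = (E_1 + E_12) + (E_2 + E_12).
    return (E1[a] + E12[a]) + (E2[a] + E12[a])

assert E("r'") == -10.0
assert E("r''") == -12.0          # r'' has lower energy: r'' is the GMEC

T = 1.0
Z = math.exp(-E("r'") / T) + math.exp(-E("r''") / T)   # e^10 + e^12
p_r1 = math.exp(-E("r'") / T) / Z
p_r2 = math.exp(-E("r''") / T) / Z

assert p_r2 > p_r1                 # arg max_r Pr(r) = r'', so S = T(r'')
print(round(p_r1, 3), round(p_r2, 3))  # 0.119 0.881
```

Minimizing E(r) and maximizing Pr(r) pick out the same assignment, which is the point of the reformulation.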

24 2. Inference problem (11) Minimization goal (from the definition): S = T(arg min_r E(r)). Equivalent goal for a graphical model problem: S = T(arg max_r Pr(r)). We still have a problem that is not solvable in polynomial time, but formulated as an inference problem it opens the door to probabilistic methods.

25 Summary: Inference problem We model our problem as an inference problem. We can use probabilistic methods to solve it. 25

26 3. Graphical models for protein design and belief propagation (BP) 1. Model each design position as a random variable. 2. Build an interaction graph that encodes conditional independence between variables. Source: Fromer M, Yanover C. Proteins (2008). SspB dimer interface: inter-monomeric interactions (Cα).

27 3. Graphical Models/BP (2) Example: belief propagation. A node in the graphical model is a random variable: an interacting residue in the structure.

28 3. Graphical Models/BP (3) Example: belief propagation. An edge represents an energy interaction between two residues, i.e., a probabilistic dependence between two nodes. If two residues are distant from each other, there is no edge between them.

29 3. Graphical Models/BP (4) Example: belief propagation. Every random variable can be in one of several states: the allowed rotamers for that position.

30 3. Graphical Models/BP (5) Example: belief propagation. The energy of each state depends on: its singleton energy, its pairwise energies, and the states of its neighboring nodes.

31 3. Graphical Models/BP (6) Example: belief propagation. Each node tells its neighboring nodes what it believes their state should be: a message m_{i→j} is sent from node i to node j. The message is a vector whose number of dimensions equals the number of allowed states/rotamers in the recipient.

32 3. Graphical Models/BP (7) Example: belief propagation. Who sends the first message?

33 3. Graphical Models/BP (8) Example: belief propagation. Who sends the first message? In a tree: the leaves. Belief propagation is proven to be correct in a tree!

34 3. Graphical Models/BP (9) Example: belief propagation. In a graph with cycles: set initial values and send messages in parallel. No guarantees can be made! There might not be any convergence.

35 3. Graphical Models/BP (10) Example: belief propagation. All messages are initialized to 1: m_{2→3}(r′_3) = 1, m_{2→3}(r″_3) = 1, m_{1→3}(r′_3) = 1, m_{1→3}(r″_3) = 1, m_{1→2}(r′_2) = 1, m_{2→1}(r′_1) = 1, m_{3→1}(r′_1) = 1, m_{3→2}(r′_2) = 1, and so on. We iterate from there.

36 3. Graphical Models/BP (11) Example: belief propagation. Node 3 receives messages from nodes 1 and 2: m_{2→3}(r′_3) = 1, m_{2→3}(r″_3) = 1, m_{1→3}(r′_3) = 1, m_{1→3}(r″_3) = 1.

37 3. Graphical Models/BP (12) Example: belief propagation. What message m_{3→1}(r′_1) does node 3 send to node 1 on the next iteration?

38 Belief propagation: message passing. N(i) : the neighbors of variable i. The message sent on each iteration: m_{i→j}(r_j) = max_{r_i} e^((−E_i(r_i) − E_ij(r_i, r_j))/T) Π_{k ∈ N(i)\{j}} m_{k→i}(r_i).
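The update rule above can be sketched for a single edge. All energies, node labels, and message values below are illustrative placeholders, not from the paper:

```python
import math

T = 1.0

def message(i, j, e_single, e_pair, incoming, n_rot):
    """Max-product message m_{i->j}(r_j) for one edge, following the
    update rule on this slide. `incoming` maps each neighbor k of i
    to its current message vector m_{k->i}. All numbers are illustrative."""
    out = []
    for rj in range(n_rot):
        out.append(max(
            math.exp((-e_single[i][ri] - e_pair[(i, j)][ri][rj]) / T)
            * math.prod(m[ri] for k, m in incoming.items() if k != j)
            for ri in range(n_rot)))
    return out

# Node 3 sends to node 1, with m_{2->3} initialized to all ones.
e_single = {1: [-1.0, -1.0], 3: [-3.0, -1.0]}
e_pair = {(3, 1): [[-1.0, 0.0], [0.0, -2.0]]}
m_3_to_1 = message(3, 1, e_single, e_pair, {2: [1.0, 1.0]}, n_rot=2)
print(m_3_to_1)  # [e^4, e^3] ~ [54.60, 20.09]
```

Note that the message to j excludes j's own previous message to i, which is what the N(i)\{j} product expresses.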

39 Example: belief propagation, with singleton energies E_i(r_1), E_i(r_2), E_i(r_3) and pairwise energies E_ij(r_1, r_2), E_ij(r_2, r_3), E_ij(r_1, r_3) for positions 1–3, each allowing rotamers r′ and r″. Iteration 0: m_{3→1}(r′_1) = max{ e^((−E_3(r′_3) − E_31(r′_3, r′_1))/T) · m_{2→3}(r′_3), e^((−E_3(r″_3) − E_31(r″_3, r′_1))/T) · m_{2→3}(r″_3) }.

40 3. Graphical Models/BP (15) Example: belief propagation. Once BP converges, we can compute the belief each node has about itself. Belief about one's own state: multiply all incoming messages by the singleton factor.

41 Belief propagation: max-marginals. Belief about each rotamer: MM_i(r_i) = e^(−E_i(r_i)/T) Π_{k ∈ N(i)} m_{k→i}(r_i). Max-marginal probability: Pr_i(r_i) = max_{r′ : r′_i = r_i} Pr(r′). Most likely rotamer for position i: r*_i = arg max_{r_i ∈ Rots_i} Pr_i(r_i).
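A minimal sketch of the max-marginal computation, assuming hypothetical converged messages into one position (all numbers illustrative):

```python
import math

T = 1.0

def max_marginal(e_i, incoming):
    """MM_i(r_i) = exp(-E_i(r_i)/T) * prod over neighbors k of m_{k->i}(r_i).
    `e_i` is the singleton energy vector for position i; `incoming` maps
    each neighbor to its converged message vector. Values are illustrative."""
    n_rot = len(e_i)
    return [
        math.exp(-e_i[ri] / T) * math.prod(m[ri] for m in incoming.values())
        for ri in range(n_rot)
    ]

# Hypothetical converged messages into position i from neighbors 2 and 3.
mm = max_marginal([-1.0, -2.0], {2: [0.5, 1.5], 3: [2.0, 1.0]})
best = max(range(len(mm)), key=lambda ri: mm[ri])
print(best)  # 1 -> the second rotamer is the most likely for this position
```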

42 3. Graphical Models/BP (17) (figure) Fromer M, Yanover C. Proteins (2008)

43 3. Graphical Models/BP (18) (figure) Fromer M, Yanover C. Proteins (2008)

44 3. Graphical Models: Summary Formulate the design problem as an inference problem. Model it as a graphical model, establishing edges between interacting residues. Use belief propagation to find the beliefs for each position.

45 4. tbmmf: type-specific BMMF (Best Max-Marginal First). The paper's main contribution. Builds on previous work by C. Yanover (2004). Uses belief propagation to find the lowest energy sequence, then constrains the space to find subsequent sequences.

46 tbmmf (simplification) 1. Find the lowest energy sequence using BP. 2. Find the next lowest energy sequence while excluding amino acids from the previous one. 3. Partition the space into two subspaces using constraints derived from that next lowest energy sequence, and repeat the search in each subspace.
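A heavily simplified sketch of this loop: brute-force enumeration stands in for BP and for the subspace partitioning, and constraints are reduced to excluding whole sequences already found. The rotamer types and energies are illustrative only:

```python
import itertools

# Toy problem: 2 positions, rotamers tagged with amino acid types.
# All numbers and types are illustrative placeholders.
rot_types = [["A", "A", "G"], ["L", "V"]]   # T(k) per position/rotamer
e_single = [[-1.0, -0.5, -2.0], [-3.0, -2.5]]
e_pair = {(0, 1): [[-1.0, 0.0], [0.0, -2.0], [-0.5, -0.5]]}

def energy(r):
    e = sum(e_single[i][ri] for i, ri in enumerate(r))
    e += sum(t[r[i]][r[j]] for (i, j), t in e_pair.items())
    return e

def sequence(r):
    """T(r): the amino acid sequence of rotamer assignment r."""
    return tuple(rot_types[i][ri] for i, ri in enumerate(r))

def next_sequences(n):
    """Return the n lowest-energy *sequences*, excluding each one found
    before searching again (brute force stands in for BP + partitioning)."""
    found = []
    while len(found) < n:
        best = min(
            (r for r in itertools.product(*map(range, map(len, rot_types)))
             if sequence(r) not in [s for s, _ in found]),
            key=energy, default=None)
        if best is None:
            break
        found.append((sequence(best), energy(best)))
    return found

print(next_sequences(3))
# [(('G', 'L'), -5.5), (('A', 'L'), -5.0), (('A', 'V'), -5.0)]
```

The key point the sketch preserves is that tbmmf ranks distinct *sequences*, not rotamer assignments: several assignments can map to the same sequence via T(r), and only the best one per sequence matters.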

47 4. tbmmf (3) Example: tbmmf (1). (figure) Fromer M, Yanover C. Proteins (2008)

48 4. tbmmf (4) Example: tbmmf (2). (figure) Fromer M, Yanover C. Proteins (2008)

49 Results Fromer M, Yanover, C. Proteins (2008) 49

50 Results (2) Fromer M, Yanover, C. Proteins (2008) 50

51 Results (3) Algorithms tried: DEE/A* (Goldstein, 1-split, 2-split, Magic Bullet); tbmmf; Ros: Rosetta; SA: simulated annealing over sequence space.

52 Results (4): Assessment results Fromer M, Yanover, C. Proteins (2008) 52

53 Results(5) Fromer M, Yanover, C. Proteins (2008) 5

54 Results(6) Fromer M, Yanover, C. Proteins (2008) 54

55 Results(6) Fromer M, Yanover, C. Proteins (2008) 55

56 Results (7) DEE/A* was not feasible for any case except the prion. SspB: A* could only output one sequence; DEE also did not finish after 12 days; BD/K* did not finish after 12 days.

57 Results (8) Predicted sequences were highly similar to one another (high sequence identity) but very different from the wild-type sequence. Solution: grouped tbmmf, which applies constraints to whole groups of amino acids (proof of concept only).

58 Conclusions Fast and accurate algorithm. Outperforms all other algorithms: A* is not feasible; better accuracy than other probabilistic algorithms.

59 Conclusions (2) tbmmf produces a large set of very similar low energy results. This might be due to the many inaccuracies in the model. Grouped tbmmf can produce a diverse set of low energy sequences.

60 Conclusions (3) The results lack experimental data for validation.

61 Related Work: (Fromer et al. 2008) Fromer M, Yanover C. A computational framework to empower probabilistic protein design. ISMB 2008. Phage display: randomized protein sequences are simultaneously tested for relevant biological function.

62 Related Work: (Fromer et al. 2008) Fromer M, Yanover, C. Bioinformatics (2008) 62

63 Related Work: (Fromer et al. 2008) Uses sum-product instead of max-product to obtain per-position amino acid probabilities. Ran until convergence or a maximum number of iterations; all structures converged.

64 Related Work: (Fromer et al. 2008) Conclusions: the model produces probability distributions far from those observed experimentally. Limitations of the model: imprecise energy function; decomposition into pairwise energy terms; assumption of a fixed backbone; discretization of side-chain conformations.

65 tbmmf algorithm 65


More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Stat 521A Lecture 18 1

Stat 521A Lecture 18 1 Stat 521A Lecture 18 1 Outline Cts and discrete variables (14.1) Gaussian networks (14.2) Conditional Gaussian networks (14.3) Non-linear Gaussian networks (14.4) Sampling (14.5) 2 Hybrid networks A hybrid

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Variational Inference IV: Variational Principle II Junming Yin Lecture 17, March 21, 2012 X 1 X 1 X 1 X 1 X 2 X 3 X 2 X 2 X 3 X 3 Reading: X 4

More information

Bio nformatics. Lecture 23. Saad Mneimneh

Bio nformatics. Lecture 23. Saad Mneimneh Bio nformatics Lecture 23 Protein folding The goal is to determine the three-dimensional structure of a protein based on its amino acid sequence Assumption: amino acid sequence completely and uniquely

More information

5. Sum-product algorithm

5. Sum-product algorithm Sum-product algorithm 5-1 5. Sum-product algorithm Elimination algorithm Sum-product algorithm on a line Sum-product algorithm on a tree Sum-product algorithm 5-2 Inference tasks on graphical models consider

More information

Lecture 8: Bayesian Networks

Lecture 8: Bayesian Networks Lecture 8: Bayesian Networks Bayesian Networks Inference in Bayesian Networks COMP-652 and ECSE 608, Lecture 8 - January 31, 2017 1 Bayes nets P(E) E=1 E=0 0.005 0.995 E B P(B) B=1 B=0 0.01 0.99 E=0 E=1

More information

Dynamic Approaches: The Hidden Markov Model

Dynamic Approaches: The Hidden Markov Model Dynamic Approaches: The Hidden Markov Model Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Inference as Message

More information

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Template Free Protein Structure Modeling Jianlin Cheng, PhD Template Free Protein Structure Modeling Jianlin Cheng, PhD Professor Department of EECS Informatics Institute University of Missouri, Columbia 2018 Protein Energy Landscape & Free Sampling http://pubs.acs.org/subscribe/archive/mdd/v03/i09/html/willis.html

More information

Message passing and approximate message passing

Message passing and approximate message passing Message passing and approximate message passing Arian Maleki Columbia University 1 / 47 What is the problem? Given pdf µ(x 1, x 2,..., x n ) we are interested in arg maxx1,x 2,...,x n µ(x 1, x 2,..., x

More information

Probabilistic Graphical Models. Theory of Variational Inference: Inner and Outer Approximation. Lecture 15, March 4, 2013

Probabilistic Graphical Models. Theory of Variational Inference: Inner and Outer Approximation. Lecture 15, March 4, 2013 School of Computer Science Probabilistic Graphical Models Theory of Variational Inference: Inner and Outer Approximation Junming Yin Lecture 15, March 4, 2013 Reading: W & J Book Chapters 1 Roadmap Two

More information

Neural Networks for Protein Structure Prediction Brown, JMB CS 466 Saurabh Sinha

Neural Networks for Protein Structure Prediction Brown, JMB CS 466 Saurabh Sinha Neural Networks for Protein Structure Prediction Brown, JMB 1999 CS 466 Saurabh Sinha Outline Goal is to predict secondary structure of a protein from its sequence Artificial Neural Network used for this

More information

Inference in Bayesian Networks

Inference in Bayesian Networks Andrea Passerini passerini@disi.unitn.it Machine Learning Inference in graphical models Description Assume we have evidence e on the state of a subset of variables E in the model (i.e. Bayesian Network)

More information

Introduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah

Introduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,

More information

3 : Representation of Undirected GM

3 : Representation of Undirected GM 10-708: Probabilistic Graphical Models 10-708, Spring 2016 3 : Representation of Undirected GM Lecturer: Eric P. Xing Scribes: Longqi Cai, Man-Chia Chang 1 MRF vs BN There are two types of graphical models:

More information

Chapter 8 Cluster Graph & Belief Propagation. Probabilistic Graphical Models 2016 Fall

Chapter 8 Cluster Graph & Belief Propagation. Probabilistic Graphical Models 2016 Fall Chapter 8 Cluster Graph & elief ropagation robabilistic Graphical Models 2016 Fall Outlines Variable Elimination 消元法 imple case: linear chain ayesian networks VE in complex graphs Inferences in HMMs and

More information

Intelligent Systems:

Intelligent Systems: Intelligent Systems: Undirected Graphical models (Factor Graphs) (2 lectures) Carsten Rother 15/01/2015 Intelligent Systems: Probabilistic Inference in DGM and UGM Roadmap for next two lectures Definition

More information

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Template Free Protein Structure Modeling Jianlin Cheng, PhD Template Free Protein Structure Modeling Jianlin Cheng, PhD Associate Professor Computer Science Department Informatics Institute University of Missouri, Columbia 2013 Protein Energy Landscape & Free Sampling

More information

Expectation Propagation Algorithm

Expectation Propagation Algorithm Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,

More information

Fast and Accurate Algorithms for Protein Side-Chain Packing

Fast and Accurate Algorithms for Protein Side-Chain Packing Fast and Accurate Algorithms for Protein Side-Chain Packing Jinbo Xu Bonnie Berger Abstract This paper studies the protein side-chain packing problem using the tree-decomposition of a protein structure.

More information

CS281A/Stat241A Lecture 19

CS281A/Stat241A Lecture 19 CS281A/Stat241A Lecture 19 p. 1/4 CS281A/Stat241A Lecture 19 Junction Tree Algorithm Peter Bartlett CS281A/Stat241A Lecture 19 p. 2/4 Announcements My office hours: Tuesday Nov 3 (today), 1-2pm, in 723

More information

4 : Exact Inference: Variable Elimination

4 : Exact Inference: Variable Elimination 10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference

More information

Efficient Partition Function Estimatation in Computational Protein Design: Probabalistic Guarantees and Characterization of a Novel Algorithm

Efficient Partition Function Estimatation in Computational Protein Design: Probabalistic Guarantees and Characterization of a Novel Algorithm Efficient Partition Function Estimatation in Computational Protein Design: Probabalistic Guarantees and Characterization of a Novel Algorithm By: Hunter Nisonoff Advisers: Dr. Mauro Maggioni Professor

More information

Discrete Inference and Learning Lecture 3

Discrete Inference and Learning Lecture 3 Discrete Inference and Learning Lecture 3 MVA 2017 2018 h

More information

Introduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah

Introduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,

More information

Max$Sum(( Exact(Inference(

Max$Sum(( Exact(Inference( Probabilis7c( Graphical( Models( Inference( MAP( Max$Sum(( Exact(Inference( Product Summation a 1 b 1 8 a 1 b 2 1 a 2 b 1 0.5 a 2 b 2 2 a 1 b 1 3 a 1 b 2 0 a 2 b 1-1 a 2 b 2 1 Max-Sum Elimination in Chains

More information

A Generalized Loop Correction Method for Approximate Inference in Graphical Models

A Generalized Loop Correction Method for Approximate Inference in Graphical Models for Approximate Inference in Graphical Models Siamak Ravanbakhsh mravanba@ualberta.ca Chun-Nam Yu chunnam@ualberta.ca Russell Greiner rgreiner@ualberta.ca Department of Computing Science, University of

More information

Undirected graphical models

Undirected graphical models Undirected graphical models Semantics of probabilistic models over undirected graphs Parameters of undirected models Example applications COMP-652 and ECSE-608, February 16, 2017 1 Undirected graphical

More information

Goals. Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions

Goals. Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions Jamie Duke 1,2 and Carlos Camacho 3 1 Bioengineering and Bioinformatics Summer Institute,

More information

Protein Structure Prediction, Engineering & Design CHEM 430

Protein Structure Prediction, Engineering & Design CHEM 430 Protein Structure Prediction, Engineering & Design CHEM 430 Eero Saarinen The free energy surface of a protein Protein Structure Prediction & Design Full Protein Structure from Sequence - High Alignment

More information

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Problem Set 3 Issued: Thursday, September 25, 2014 Due: Thursday,

More information

On the Choice of Regions for Generalized Belief Propagation

On the Choice of Regions for Generalized Belief Propagation UAI 2004 WELLING 585 On the Choice of Regions for Generalized Belief Propagation Max Welling (welling@ics.uci.edu) School of Information and Computer Science University of California Irvine, CA 92697-3425

More information

5. Density evolution. Density evolution 5-1

5. Density evolution. Density evolution 5-1 5. Density evolution Density evolution 5-1 Probabilistic analysis of message passing algorithms variable nodes factor nodes x1 a x i x2 a(x i ; x j ; x k ) x3 b x4 consider factor graph model G = (V ;

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Polynomial Linear Programming with Gaussian Belief Propagation

Polynomial Linear Programming with Gaussian Belief Propagation Polynomial Linear Programming with Gaussian Belief Propagation Danny Bickson, Yoav Tock IBM Haifa Research Lab Mount Carmel Haifa 31905, Israel Email: {dannybi,tock}@il.ibm.com Ori Shental Center for Magnetic

More information