De Novo Protein Structure Prediction
- Poppy Russell
- 5 years ago
Transcription
1 De Novo Protein Structure Prediction
2 Multiple Sequence Alignment Input sequences: ACT, ATT, TGG
3 Multiple Sequence Alignment One possible alignment: ACT → ACT--, ATT → ATT--, TGG → --TGG
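4 Multiple Sequence Alignment What about the simple extension from 2D? For three sequences $u$, $v$, $w$ there are seven possible endings for the last alignment column: each sequence contributes either its final character or a gap, and the all-gap column is excluded. In general there are $2^k - 1$ endings for $k$ sequences. Why?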
5 Multiple Sequence Alignment
$s_{i,j,k} = \max \begin{cases} s_{i-1,j-1,k-1} + \delta(u_i, v_j, w_k) \\ s_{i-1,j-1,k} + \delta(u_i, v_j, -) \\ s_{i-1,j,k-1} + \delta(u_i, -, w_k) \\ s_{i,j-1,k-1} + \delta(-, v_j, w_k) \\ s_{i-1,j,k} + \delta(u_i, -, -) \\ s_{i,j-1,k} + \delta(-, v_j, -) \\ s_{i,j,k-1} + \delta(-, -, w_k) \end{cases}$
$\delta(x, y, z)$ is an entry in the 3D scoring matrix. Time and space grow exponentially with the number of sequences.
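The seven-case recurrence can be sketched in a few lines of Python. This is only an illustration: the column score `column_score` is assumed here to be a sum of pairwise match/mismatch/gap scores (+1/-1/-1), which the slide does not specify, and the function names are hypothetical.

```python
from itertools import product

def pair_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score one pair of symbols within a column ('-' denotes a gap)."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch

def column_score(x, y, z):
    """Assumed delta(x, y, z): sum of the three pairwise scores of a column."""
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)

def align3(u, v, w):
    """Optimal three-way alignment score via the 3D dynamic program."""
    n1, n2, n3 = len(u), len(v), len(w)
    neg_inf = float("-inf")
    s = [[[neg_inf] * (n3 + 1) for _ in range(n2 + 1)] for _ in range(n1 + 1)]
    s[0][0][0] = 0
    # Each move says which sequences consume a character: 2^3 - 1 = 7 moves.
    moves = [m for m in product((0, 1), repeat=3) if any(m)]
    for i, j, k in product(range(n1 + 1), range(n2 + 1), range(n3 + 1)):
        for di, dj, dk in moves:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            col = (u[pi] if di else "-", v[pj] if dj else "-", w[pk] if dk else "-")
            s[i][j][k] = max(s[i][j][k], s[pi][pj][pk] + column_score(*col))
    return s[n1][n2][n3]

print(align3("ACT", "ATT", "TGG"))
```

The table has $(n_1+1)(n_2+1)(n_3+1)$ cells and each cell looks at seven predecessors, which is where the exponential growth in the number of sequences comes from.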
6 Scoring Sum-of-Pairs Scoring (SP):
$S(A) = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} S(\bar{s}_i, \bar{s}_j)$
$A$: a multiple alignment. $\bar{s}_i$: projection of sequence $i$ ($s_i$ with gaps). $S(\bar{s}_i, \bar{s}_j)$: score of the pairwise alignment.
Idea: A good multiple alignment should contain good pairwise alignments.
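As a small illustration of SP scoring, the sketch below scores the rows of a given multiple alignment pair by pair. The +1/-1/-1 match/mismatch/gap values and the function names are assumptions made for the example; columns where both rows have a gap are skipped, since they vanish in the pairwise projection.

```python
from itertools import combinations

def projected_pair_score(row_a, row_b, match=1, mismatch=-1, gap=-1):
    """Score the pairwise alignment induced by two rows of a multiple alignment."""
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue  # gap/gap columns drop out of the pairwise projection
        if a == "-" or b == "-":
            total += gap
        else:
            total += match if a == b else mismatch
    return total

def sum_of_pairs(alignment):
    """S(A): sum of induced pairwise scores over all pairs of rows."""
    return sum(projected_pair_score(a, b) for a, b in combinations(alignment, 2))

print(sum_of_pairs(["ACT--", "ATT--", "--TGG"]))
```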
7 Pruning the DP Matrix The dynamic programming matrix is large, but we only want the best alignment, and most matrix elements are not on that path. Can we direct the search to avoid evaluating cells that are provably not on the best path?
$S_v$: score of the best path from the start to $v$.
$F_v$: bound on the best path from $v$ to the end.
$K$: score of the best known alignment.
What if $S_v + F_v < K$?
8 Pruning the DP Matrix Sequences: ARSTVK, ASVK, ARTR. Let $v = (3, 2, 2)$. $S_v$ is the score of the best alignment of ARS, AS, AR. $F_v$ is an upper bound on the score of aligning TVK, VK, TR. If $S_v + F_v < K$, then mark $v$ as dead-ending (i.e., prune $v$).
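9 Pruning the DP Matrix We know the alignment score is bounded by:
$S(A) \le \sum_{k=1}^{m-1} \sum_{l=k+1}^{m} S(s_k, s_l)$
Observation: $S(A) = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} S(\bar{s}_i, \bar{s}_j)$ and $S(\bar{s}_i, \bar{s}_j) \le S(s_i, s_j)$, where $S(s_i, s_j)$ is the optimal pairwise alignment score.
So our bound can be:
$F_v = \sum_{k=1}^{m-1} \sum_{l=k+1}^{m} S(s^k_{v_k+1 \ldots n_k}, s^l_{v_l+1 \ldots n_l})$
Runtime for computing $F$ (using dynamic programming): $O(n^2 m^2)$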
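The bound $F_v$ can be sketched by running an ordinary pairwise global alignment on every pair of remaining suffixes and summing the scores. The scoring values (+1/-1/-1) and the names `nw_score` and `upper_bound_F` are assumptions for illustration; the example reuses the sequences and the cell $v = (3, 2, 2)$ from the previous slide.

```python
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global pairwise alignment score (linear gap cost), O(|a||b|)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def upper_bound_F(seqs, v):
    """F_v: sum of optimal pairwise scores over the suffixes remaining after v."""
    suffixes = [s[pos:] for s, pos in zip(seqs, v)]
    return sum(nw_score(a, b) for a, b in combinations(suffixes, 2))

seqs = ["ARSTVK", "ASVK", "ARTR"]
print(upper_bound_F(seqs, (3, 2, 2)))  # bound on aligning TVK, VK, TR
```

A cell $v$ would then be pruned whenever its $S_v$ plus this value falls below the score $K$ of the best alignment found so far. In practice the pairwise suffix tables would be computed once and reused for every cell rather than recomputed per call.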
10 Backbone Native State Each point on the energy landscape defines a conformation and associated energy. How many degrees of freedom should we have? How many do we want? A protein conformation can be represented by a vector of DOF choices, $c = (\ldots, r_i, \ldots)$, with energy $E(c) = \sum_{i \ne j} E_{i,j} + \sum_i E_i$.
11 Backbone Native State Each point on the energy landscape defines a conformation and associated energy. How many degrees of freedom should we have? How many do we want? A protein conformation can be represented by a vector of DOF choices, $c = (\ldots, r_i, \ldots)$, and the conformation with minimum (potential) energy is $c^* = \arg\min_c E(c)$.
12 Primary Sequence, Energy Function, Conformation Space. In order to apply a discrete optimization technique, we need a discretized search space.
13 Primary Sequence Energy Function Algorithm Conformation Space
14 (Homologous) Backbone Energy Function Algorithm Conformation Space
15 (Homologous) Backbone X-ray Data Algorithm Conformation Space
16 (Homologous) Backbone X-ray Data Energy Function Algorithm Conformation Space
17 (Homologous) Backbone X-ray Data Energy Function PDB Statistics Algorithm Conformation Space
18 (Homologous) Backbone NMR Data Energy Function PDB Statistics Algorithm Conformation Space
19 Prior Knowledge and Observations (Sequence/Fold, Energy Function, Statistics, Experimental Data) → Conformation Space → Algorithm → Best-Fit Model (3D Structure, Backbone, Sidechains, Docking, Design)
20 Prior Knowledge and Observations (Sequence/Fold, Energy Function, PDB Statistics, Experimental Data): accurate (enough) model? Conformation Space and Algorithm: fast (enough)? Correct (enough) solution? Best-Fit Model (3D Structure, Backbone, Sidechains, Docking, Design)
21 Prior Knowledge and Observations (Sequence/Fold, Energy Function, Statistics, Experimental Data). Conformation Space and Algorithm: fast (enough)? We want $O(n^c)$ and not $O(c^n)$. Correct objective function? Guarantees on solution quality? Best-Fit Model (3D Structure, Backbone, Sidechains, Docking, Design)
22 Discretizing Sidechains Table 1. Published rotamer libraries. [Dunbrack, Rotamer Libraries in the 21st Century, 2002]
Authors | Year | Type of library | Number of proteins in library | Resolution (Å)
Chandrasekaran and Ramachandran [2] | 1970 | BBIND | 3 | NA
Janin et al. [4] | 1978 | BBIND, SSDEP
Bhat et al. [3] | 1979 | BBIND | 23 | NA
James and Sielecki [5] | 1983 | BBIND | 5 | 1.8, R-factor < 0.15
Benedetti et al. [6] | 1983 | BBIND | 238 peptides | R-factor < 0.10
Ponder and Richards [7] | 1987 | BBIND
McGregor et al. [8] | 1987 | SSDEP
Tuffery et al. [9] | 1991 | BBIND
Dunbrack and Karplus [10] | 1993 | BBIND, BBDEP
Schrauber et al. [11] | 1993 | BBIND, SSDEP
Kono and Doi [12] | 1996 | BBIND | 103 | NA
De Maeyer et al. [13] | 1995 | BBIND
Dunbrack and Cohen [14] | | BBIND, BBDEP | 850* | 1.7
Lovell et al. [15] | 2000 | BBIND, SSDEP
*Latest update, May. NA, not available.
23 Energy Functions Standard approaches (e.g. Amber, CHARMM, GROMACS) model potential energy as $E_{total} = E_{bonded} + E_{nonbonded}$, where $E_{bonded} = E_{bond} + E_{angle} + E_{dihedral}$ and $E_{nonbonded} = E_{electrostatic} + E_{vdW}$.
25 (Homologous) Backbone Energy Function Algorithm Conformation Space
26 (Figure: phenylalanine with sidechain R.) For a protein with $n$ amino acids, the protein backbone has $2n - 2$ degrees of freedom. Sidechain conformations are also defined by dihedral angles, but can be discretized by rotamers. [Shapovalov, Dunbrack 11]
27 Native State Backbone Each point on the energy landscape defines a conformation and associated energy. For sidechain placement, we have $n$ degrees of freedom. Each amino acid has a number of states equal to the number of rotamers for that type. A sidechain conformation can be represented by a vector of rotamer choices, $c = (\ldots, i_r, \ldots)$, and the conformation with minimum (potential) energy is $c^* = \arg\min_c E(c)$.
28 Dead End Elimination
- One of the only deterministic, non-trivial, and effective combinatorial optimization algorithms in Computational Structural Biology.
- Prunes rotamers that are provably NOT part of the global minimum energy conformation (GMEC).
- Used for side-chain placement (tertiary structure prediction) and protein design.
Original DEE
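29 Dead End Elimination (Figure: total energy of rotamers 1, 2, 3.)
30 Dead End Elimination (Figure: total energy of rotamers $i_r$ and $i_t$.)
31 Dead End Elimination (Figure: total energy of rotamers $i_r$ and $i_t$.)
32 Dead End Elimination (Figure: total energy of rotamers $i_r$ and $i_t$.)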
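33 Dead End Elimination Original DEE (Simplified) (Figure: rotamers $i_r$ and $i_t$ against neighboring rotamers 2 and 3, with unknown interaction energies.)
34 Dead End Elimination Original DEE (Simplified) (Figure: the minimum energy of $i_r$ compared against the maximum energy of $i_t$ over neighboring rotamers 2 and 3.)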
35 Dead End Elimination Original DEE (Simplified) Pierce, Spriet, Desmet, Mayo, JCC, 2000
36 Dead End Elimination Original DEE: Pierce, Spriet, Desmet, Mayo, JCC, 2000
37 Dead End Elimination Goldstein Criterion: $E(i_r) - E(i_t) + \sum_{j \ne i} \min_s \{E(i_r, j_s) - E(i_t, j_s)\} > 0$ Pierce, Spriet, Desmet, Mayo, JCC, 2000
38 Dead End Elimination Goldstein Criterion: $E(i_r) - E(i_t) + \sum_{j \ne i} \min_s \{E(i_r, j_s) - E(i_t, j_s)\} > 0$ Pierce, Spriet, Desmet, Mayo, JCC, 2000
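A minimal sketch of iterated pruning with the Goldstein criterion follows. The data layout is an assumption made for the example: `self_e[i][r]` holds $E(i_r)$ and `pair[(i, j)][r][s]` holds $E(i_r, j_s)$ for every pair of positions $i < j$; all function names are hypothetical.

```python
def pair_energy(pair, i, r, j, s):
    """Look up E(i_r, j_s) from a table stored only for keys (i, j) with i < j."""
    return pair[(i, j)][r][s] if i < j else pair[(j, i)][s][r]

def goldstein_prunes(self_e, pair, alive, i, r, t):
    """True if rotamer t at position i eliminates rotamer r (Goldstein criterion)."""
    gap = self_e[i][r] - self_e[i][t]
    for j in alive:
        if j == i:
            continue
        # Worst case for r relative to t over the surviving rotamers at j.
        gap += min(pair_energy(pair, i, r, j, s) - pair_energy(pair, i, t, j, s)
                   for s in alive[j])
    return gap > 0

def dee(self_e, pair):
    """Repeat Goldstein pruning until no rotamer can be eliminated."""
    alive = {i: set(range(len(e))) for i, e in enumerate(self_e)}
    changed = True
    while changed:
        changed = False
        for i in alive:
            for r in sorted(alive[i]):
                if any(t != r and goldstein_prunes(self_e, pair, alive, i, r, t)
                       for t in alive[i]):
                    alive[i].discard(r)
                    changed = True
    return alive

self_e = [[0.0, 5.0], [0.0, 1.0]]
pair = {(0, 1): [[0.0, 0.0], [1.0, 1.0]]}
print(dee(self_e, pair))
```

Because each pruning step only removes rotamers that are provably worse than some surviving alternative, the GMEC is never eliminated; pruning may stop with more than one rotamer left at a position, in which case a search over the reduced space is still needed.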
39 Dead End Elimination Generalized Goldstein Criterion: $E(i_r) - \sum_{t=1}^{T} C_t E(i_t) + \sum_{j \ne i} \min_s \{E(i_r, j_s) - \sum_{t=1}^{T} C_t E(i_t, j_s)\} > 0$ Pierce, Spriet, Desmet, Mayo, JCC, 2000
40 Conformation Space (Figure: conformation space partitioned into regions $k_a$, $k_b$, $k_c$, $k_d$, $k_e$.) The idea behind split DEE is that the conformation space can be partitioned to improve pruning. If a particular rotamer can be eliminated in every partition (possibly by a different competitor in each), then it is not in the GMEC.
41 Dead End Elimination Simple Split DEE (for each partition): $E(i_r) - E(i_t) + \sum_{j \ne i,\, j \ne k} \min_s \{E(i_r, j_s) - E(i_t, j_s)\} + [E(i_r, k_v) - E(i_t, k_v)] > 0$ Pierce, Spriet, Desmet, Mayo, JCC, 2000
42 TABLE II. CPU Minutes Consumed Using Goldstein (T = 1) DEE, Split (s = 1) DEE, and Split (s = 2 mb) DEE for Each of Three Test Cases.
Columns: Case | Method | (T = 1) time | (s = 1) time | (s = 2 mb) time | Doubles time | Total time
Rows (for each of cases 1, 2, 3): Goldstein (T = 1), Split (s = 1), Split (s = 2 mb)
a Failed to converge due to combinatorial explosion in the number of superrotamers created by unification.
FIGURE 5. Plastocyanin core design (the two split methods are indistinguishable). FIGURE 6. Protein G core/boundary design. FIGURE 7. Protein G surface design.
43 Extensions and Results Sidechain placement vs. design: is there a difference? DEE can be an extremely powerful pruning strategy, but what do we do in cases where the conformation space remains large? Can we do better than looking at conformations exhaustively?
44 Conformation Space (Figure: two residues that are far apart, each with rotamers 1, 2, 3.) Explicit: (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3). An explicit representation considers all possible conformations individually, leading to an exponentially sized conformation space.
45 Factorized Conformation Space (Figure: two residues that are far apart, each with rotamers 1, 2, 3.) Implicit: (1, 2, 3), (1, 2, 3). Rather than defining the conformation space explicitly, by considering only local interactions we can obtain a compact, factored representation of the conformation space.
46 Factorized Conformation Space (Figure: residues $i$ and $j$, each with rotamers 1, 2, 3, and energy terms $E(i, j)$, $E(i, i')$, $E(j, j')$.) Implicit: (1, 2, 3), (1, 2, 3). Rather than defining the conformation space explicitly, by considering only local interactions we can obtain a compact, factored representation of the conformation space.
47 Factorized Conformation Space Rather than defining the conformation space explicitly, by considering only local interactions we can obtain a compact, factored representation of the conformation space.
48 Protein Interaction Graph We can construct an interaction graph of residues in which edges are defined for residues that are close enough. These define pairwise energy terms for any chosen pair of rotamers.
49 Protein Interaction Graph The sidechain placement problem is then to select rotamers at each position so as to minimize the sum over all edges of interaction energies.
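To make the objective concrete, the sketch below evaluates the energy of one rotamer assignment on an interaction graph and finds the minimum by exhaustive enumeration. The layout (`self_e[i][r]` for self energies, `edge_e[(i, j)][r][s]` for pairwise energies on graph edges) and the numbers are made up for illustration; exhaustive search is only feasible for tiny instances and serves as a reference for the smarter methods that follow.

```python
from itertools import product

def total_energy(choice, self_e, edge_e):
    """Self energies plus pairwise energies summed over interaction-graph edges."""
    energy = sum(self_e[i][r] for i, r in enumerate(choice))
    energy += sum(table[choice[i]][choice[j]] for (i, j), table in edge_e.items())
    return energy

def brute_force_gmec(self_e, edge_e):
    """Enumerate every rotamer assignment and return a minimum-energy one."""
    spaces = [range(len(e)) for e in self_e]
    return min(product(*spaces), key=lambda c: total_energy(c, self_e, edge_e))

# Three residues with two rotamers each; edges (0, 1) and (1, 2) only.
self_e = [[0, 1], [0, 0], [2, 0]]
edge_e = {(0, 1): [[3, 0], [0, 3]], (1, 2): [[0, 2], [2, 0]]}
best = brute_force_gmec(self_e, edge_e)
print(best, total_energy(best, self_e, edge_e))
```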
50 Linear Programming (Diagram: $A, b, c$ → LP Solver → $x_1, x_2, \ldots, x_n$.) Minimize $\sum_i c_i x_i$ subject to $A x \le b$. Linear programming is a general-purpose tool for optimizing a linear objective function under linear constraints. The general problem of Linear Programming is polynomial-time solvable.
51 Linear Programming (Diagram: $A, b, c$ → ILP Solver → $x_1, x_2, \ldots, x_n$.) Minimize $\sum_i c_i x_i$ subject to $A x \le b$, $x_i \in \mathbb{Z}$. Linear programming is a general-purpose tool for optimizing a linear objective function under linear constraints. Integer Linear Programming is not known to be polynomial-time solvable.
52 Linear Programming (Diagram: $A, b, c$ → ILP Solver → $x_1, x_2, \ldots, x_n$.) Minimize $\sum_i c_i x_i$ subject to $A x \le b$, $x_i \in \{0, 1\}$. Linear programming is a general-purpose tool for optimizing a linear objective function under linear constraints. Integer Linear Programming is not known to be polynomial-time solvable.
53 (Figure: feasible region $P$ of a linear program. [Wikipedia]) The feasible region of a set of constraints can be viewed as the set of all points that satisfy the constraints. LP solvers search this region for a point that optimizes the objective function.
54 (Figure: feasible region $P$ of a linear program. [Wikipedia]) Fact: Some vertex of the feasible region is optimal. Fact: A vertex is optimal if there is no better neighboring vertex. Dantzig (1947) came up with the simplex algorithm:
Set v = any vertex
While a neighbor vertex v' has better cost:
    v = v'
55 LP for Sidechain Placement
Minimize $E = \sum_{u \in V} E_{uu} x_{uu} + \sum_{\{u,v\} \in D} E_{uv} x_{uv}$
subject to
$\sum_{u \in V_j} x_{uu} = 1$ for $j = 1, \ldots, p$
$\sum_{u \in V_j} x_{uv} = x_{vv}$ for $j = 1, \ldots, p$ and $v \in V \setminus V_j$
$x_{uu}, x_{uv} \in \{0, 1\}$ (IP1) [Kingsford et al. 05]
Integer linear programming (ILP) gives linear constraints on a set of variables, and a linear cost function. The goal is to minimize cost (determined by variable choices) while satisfying the constraints. ILP does not care about the energy function, or about the fact that the interaction graph comes from a protein structure.
56 LP for Sidechain Placement
Minimize $E = \sum_{u \in V} E_{uu} x_{uu} + \sum_{\{u,v\} \in D} E_{uv} x_{uv}$
subject to
$\sum_{u \in V_j} x_{uu} = 1$ for $j = 1, \ldots, p$
$\sum_{u \in V_j} x_{uv} = x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$\sum_{u \in V_j : E_{uv} < 0} x_{uv} \le x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$x_{uu}, x_{uv} \in \{0, 1\}$ (IP2) [Kingsford et al. 05]
One simple optimization is to include only rotamer pairs that have a non-zero pairwise energy. These pairs can be precomputed, which reduces the number of constraints.
57 LP for Sidechain Placement
Minimize $E = \sum_{u \in V} E_{uu} x_{uu} + \sum_{\{u,v\} \in D} E_{uv} x_{uv}$
subject to
$\sum_{u \in V_j} x_{uu} = 1$ for $j = 1, \ldots, p$
$\sum_{u \in V_j} x_{uv} = x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$\sum_{u \in V_j : E_{uv} < 0} x_{uv} \le x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$x_{uu}, x_{uv} \in \{0, 1\}$ relaxed to $0 \le x_{uu}, x_{uv} \le 1$ (IP2)
What does it mean for the integrality constraints to be relaxed?
58 LP for Sidechain Placement
Minimize $E = \sum_{u \in V} E_{uu} x_{uu} + \sum_{\{u,v\} \in D} E_{uv} x_{uv}$
subject to
$\sum_{u \in V_j} x_{uu} = 1$ for $j = 1, \ldots, p$
$\sum_{u \in V_j} x_{uv} = x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$\sum_{u \in V_j : E_{uv} < 0} x_{uv} \le x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$x_{uu}, x_{uv} \in \{0, 1\}$ relaxed to $0 \le x_{uu}, x_{uv} \le 1$ (IP2)
Relaxing the integrality constraints allows the application of a polynomial-time algorithm for finding an optimal solution for the given set of constraints and objective. What does it mean to have a fractional solution?
59 LP for Sidechain Placement
Minimize $E = \sum_{u \in V} E_{uu} x_{uu} + \sum_{\{u,v\} \in D} E_{uv} x_{uv}$
subject to
$\sum_{u \in V_j} x_{uu} = 1$ for $j = 1, \ldots, p$
$\sum_{u \in V_j} x_{uv} = x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$\sum_{u \in V_j : E_{uv} < 0} x_{uv} \le x_{vv}$ for $j = 1, \ldots, p$ and $v \in N_+(V_j)$
$x_{uu}, x_{uv} \in \{0, 1\}$ relaxed to $0 \le x_{uu}, x_{uv} \le 1$ (IP2)
$\sum_{u \in S_k} x_{uu} \le p - 1$ for $k = 1, \ldots, m - 1$
We can also run this method iteratively, excluding previously identified minimum-energy conformations $S_k$ from being selected.
60 Table 3. Prediction of side-chain conformations on native backbones, with a comparison of the LP/ILP prediction with those of other methods and the crystal structure.
Columns: Core residues | All residues
(a) LP/ILP χ1/χ1+2 | %/62% | 80%/51%
(b) Scwrl χ1/χ1+2 | %/60% | 80%/49%
(c) LP/ILP rmsd | Å | Å
(d) Scwrl rmsd | Å | Å
All values are averaged over the 25 proteins of Table 1. (a) The percentage of residues over all proteins for which the LP/ILP predicted conformation has the χ1 and χ1+2 dihedral angles within 20° of the native structure; (b) these values for Scwrl; (c) the rmsd of the predicted side-chain conformations from those of the native side chains using the LP/ILP method; and (d) these values for Scwrl.
Table 4. Prediction of side-chain conformations using homology modeling, with a comparison of the LP/ILP prediction with those of other methods and the crystal structure.
Columns: Core residues (Å) | All residues (Å)
Rows: (a) LP/ILP rmsd, (b) Scwrl rmsd, (c) Backbone rmsd
All values are averaged over the 33 problems of Table 2. (a) The rmsd between just sidechain atoms when comparing the LP/ILP predicted structure with the crystal structure; (b) this value when comparing the Scwrl predictions with the native structure; and (c) the rmsd between template and target structures when only considering backbone atoms.
61 Table 5. Proteins for which the core was redesigned.
Header: Prot. Var len Rot Size Time (ILP) Rel gap N
Rows: 1aac e2 (1.3e2) aho Integral 1b9o e2 (9.4) c5e e1 Integral 1c9o e1 (4.6e1) cc e1 (2.4) cex e3 (7.0e2) cku Integral 1ctj e1 Integral 1cz e3 (3.2e2) czp e2 (1.4e2) d4t e2 (8.9e1) igd Integral 1mfm e3 (5.4e3) plc e2 (1.3e2) qj e4 (4.5e5) qq e3 (6.9e2) qtn e2 (7.0e1) qu e2 (6.4) rcf e3 (9.6e1) vfy Integral 2pth e4 (2.4e4) lzt e2 (3.9e2) p e3 (1.3e4) rsa e2 (1.4e1)
(Figure: relative gap by instance for 1aho, 1aac, 1ctj, 1igd, 1cex.)
Fig. 2. Relative gap between the optimal solution (with value OPT) and the nine next lowest-energy solutions (where the $i$-th solution has value $x_i$). Inset shows relative gaps for the 100 lowest-energy solutions for 1aac. Relative gap at each iteration $i$ is defined as $100\,(\mathrm{OPT} - x_i)/\mathrm{OPT}$.
62 Factorized Conformation Space (Figure: residues $i$ and $j$, each with rotamers 1, 2, 3, and energy terms $E(i, j)$, $E(i, i')$, $E(j, j')$.) Implicit: (1, 2, 3), (1, 2, 3). Rather than defining the conformation space explicitly, by considering only local interactions we can obtain a compact, factored representation of the conformation space.
63 Factorized Conformation Space Rather than defining the conformation space explicitly, by considering only local interactions we can obtain a compact, factored representation of the conformation space.
64 Protein Interaction Graph We can construct an interaction graph of residues in which edges are defined for residues that are close enough. These define pairwise energy terms for any chosen pair of rotamers.
65 Factor Graphs [Loeliger et al. 01] [Pearl 88, Jordan...] (Figure: chain factor graph over variables $u, v, x, y, z$ with factors $f_1, \ldots, f_5$.)
Suppose we know that the likelihood function is: $g(u, v, x, y, z) = f_1(u, v)\, f_2(v, x)\, f_3(x, y)\, f_4(y, z)\, f_5(z)$
MAP Configuration: Find the configuration of variables that maximizes $g$.
Marginalization: Find the marginal value of $g$ on a particular variable. For example: $g_z(z) = \sum_{u,v,x,y} f_1(u, v)\, f_2(v, x)\, f_3(x, y)\, f_4(y, z)\, f_5(z)$
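The marginal $g_z$ can be computed either by summing over all joint configurations or by pushing each sum inward along the chain, which is what sum-product message passing does. The sketch below compares the two on made-up factor tables; every variable is assumed to have the same number of states `d`, and each pairwise factor is stored as a nested list indexed by its two arguments.

```python
from itertools import product

def marginal_z_brute(f1, f2, f3, f4, f5, d):
    """Sum the product of all factors over u, v, x, y for each z: O(d^5)."""
    g = [0] * d
    for u, v, x, y, z in product(range(d), repeat=5):
        g[z] += f1[u][v] * f2[v][x] * f3[x][y] * f4[y][z] * f5[z]
    return g

def marginal_z_eliminate(f1, f2, f3, f4, f5, d):
    """Eliminate u, v, x, y one at a time along the chain: O(d^2) per step."""
    m_v = [sum(f1[u][v] for u in range(d)) for v in range(d)]
    m_x = [sum(m_v[v] * f2[v][x] for v in range(d)) for x in range(d)]
    m_y = [sum(m_x[x] * f3[x][y] for x in range(d)) for y in range(d)]
    return [f5[z] * sum(m_y[y] * f4[y][z] for y in range(d)) for z in range(d)]

# Illustrative factor tables (not from the slides).
f1 = [[1, 2], [3, 4]]
f2 = [[1, 0], [2, 1]]
f3 = [[2, 1], [1, 3]]
f4 = [[1, 1], [0, 2]]
f5 = [1, 3]
print(marginal_z_brute(f1, f2, f3, f4, f5, 2))
print(marginal_z_eliminate(f1, f2, f3, f4, f5, 2))
```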
66 Factor Graphs [Loeliger et al. 01] [Pearl 88, Jordan...] (Figure: chain factor graph over variables $u, v, x, y, z$ with factors $f_1, \ldots, f_5$.)
Suppose we know that the likelihood function is: $g(u, v, x, y, z) = f_1(u, v)\, f_2(v, x)\, f_3(x, y)\, f_4(y, z)\, f_5(z)$
MAP Configuration: Find the configuration of variables that maximizes $g$.
Here, variables take on a fixed number of states, and factors define local interactions.
67 Factor Graphs [Loeliger et al. 01] [Pearl 88, Jordan...] (Figure: chain factor graph over variables $u, v, x, y, z$ with factors $f_1, \ldots, f_5$.)
Suppose we know that the likelihood function is: $g(u, v, x, y, z) = f_1(u, v)\, f_2(v, x)\, f_3(x, y)\, f_4(y, z)\, f_5(z)$
MAP Configuration: Find the configuration of variables that maximizes $g$:
$\max_{\langle u,v,x,y,z \rangle} g(u, v, x, y, z) = \max_z f_5(z) \max_y f_4(y, z) \max_x f_3(x, y) \max_v f_2(v, x) \max_u f_1(u, v)$
We can define likelihoods using the Boltzmann distribution: $\Pr[c] \propto e^{-E(c)}$
68 (Figure: factor graph overlaid on 1UBQ, ubiquitin.) To construct a protein factor graph, we take each amino acid in the primary sequence as a variable, and its sidechain conformations as states. Univariate and bivariate factors are defined using self- and pairwise energies (i.e., probabilities). A MAP configuration corresponds to a minimum-energy conformation. Boltzmann distribution: $\Pr[c] \propto e^{-E(c)}$. The model (with appropriate parameters) can be used to analyze protein energetics [Yanover/Weiss 02, Xu 05, Kamisetty et al 07, 11].
69 Max-Product Algorithm [Pearl 88] (Figure: chain factor graph over variables $u, v, x, y, z$ with factors $f_1, \ldots, f_5$.)
Maximization can be computed by message passing:
$\mu_{f_j \to x_i}(x_i) = \max_{X_j \setminus x_i} f_j(X_j) \prod_{x \in X_j \setminus x_i} \mu_{x \to f_j}(x)$
$\max_z f_5(z) \max_y f_4(y, z) \max_x f_3(x, y) \max_v f_2(v, x) \max_u f_1(u, v)$
Once all messages have been passed, we can assign a maximizing configuration starting at leaf factors.
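On the chain used in these slides, max-product reduces to a forward pass of maximizations followed by a backward pass that reads off the maximizing states. The following sketch assumes non-negative factor tables stored as nested lists and a common number of states `d`; the tables and the function name `map_chain` are illustrative.

```python
def map_chain(f1, f2, f3, f4, f5, d):
    """Max-product on the chain u - v - x - y - z; returns (max value, (u, v, x, y, z))."""
    pair_factors = [f1, f2, f3, f4]  # f1(u,v), f2(v,x), f3(x,y), f4(y,z)
    msg = [1] * d                    # message into the first variable u
    back = []                        # back[k][b]: best previous state given state b
    for f in pair_factors:
        new_msg, arg = [], []
        for b in range(d):
            best = max(range(d), key=lambda a: msg[a] * f[a][b])
            arg.append(best)
            new_msg.append(msg[best] * f[best][b])
        back.append(arg)
        msg = new_msg
    z = max(range(d), key=lambda s: msg[s] * f5[s])
    value = msg[z] * f5[z]
    config = [z]
    for arg in reversed(back):       # trace maximizers back from z to u
        config.append(arg[config[-1]])
    return value, tuple(reversed(config))

f1 = [[1, 2], [3, 4]]
f2 = [[1, 0], [2, 1]]
f3 = [[2, 1], [1, 3]]
f4 = [[1, 1], [0, 2]]
f5 = [1, 3]
print(map_chain(f1, f2, f3, f4, f5, 2))
```

With factors defined as $e^{-E}$, maximizing the product is the same as minimizing the summed energy, so an implementation for energies would typically work in log space (min-sum) to avoid numerical underflow.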
70 Dealing with Cycles [Pearl 88, Yedidia et al ] (Figure: factor graph with a cycle among $u$, $v$, $x$.) Computing marginals or MAP configurations exactly in a model with cycles is NP-hard. However, we can still use the sum-product algorithm in two ways:
- Collapse multiple variables into a single variable to eliminate cycles.
- Run sum-product as before, but until convergence.
One method is exact, while the other is approximate.
71 Dealing with Cycles [Pearl 88, Yedidia et al ] (Figure: the same graph with $u$ and $v$ grouped into a single variable.) Computing marginals or MAP configurations exactly in a model with cycles is NP-hard. However, we can still use the sum-product algorithm in two ways:
- Collapse multiple variables into a single variable to eliminate cycles.
- Run sum-product as before, but until convergence.
Variable/factor grouping must be chosen carefully to avoid state-space explosion.
72 Unfortunately exact methods are prohibitively expensive if we consider longer-range interactions. We can approximate by stopping message passing near (or at) convergence.
73 Tree Decomposition Fig. 1. Example of a residue interaction graph. Fig. 2. Example of the biconnected-component decomposition of a graph (components: abcdefm, fg, fh, clk, eij). The width of this decomposition is 6. Given a factor graph, we can actually reorganize it as long as we don't lose any dependencies. But we don't want to add too many unnecessary ones either.
74 Tree Decomposition Fig. 3. Example of a tree decomposition of a graph with width 3 (bags: fh, abd, acd, cdem, defm, fg, clk, eij). Given a factor graph, we can actually reorganize it as long as we don't lose any dependencies. But we don't want to add too many unnecessary ones either. In general, which trees capture the original graph, and how can we measure how good a particular tree is?
75 Tree Decomposition Fig. 3. Example of a tree decomposition of a graph with width 3 (bags: fh, abd, acd, cdem, defm, fg, clk, eij). A tree decomposition is a tree on vertex subsets $X_i$ that satisfies the following:
1. The union of all vertex subsets equals the original vertex set.
2. For any edge $(u, v)$ in the original graph, there is some $X_i$ with $u, v \in X_i$.
3. If $v \in X_i$ and $v \in X_j$, then $v \in X_k$ for all $X_k$ on the path between $X_i$ and $X_j$.
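The three conditions can be checked mechanically. The sketch below is a hypothetical helper, not part of the lecture: `bags` maps a bag name to its vertex subset, and `tree_edges` is assumed to already form a tree over the bag names (that assumption is not verified). Condition 3 is tested in its equivalent form: for each vertex, the bags containing it must be connected within the tree.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions (tree_edges assumed to be a tree)."""
    vertices = set(vertices)
    # 1. The union of all bags equals the original vertex set.
    if set().union(*bags.values()) != vertices:
        return False
    # 2. Every edge of the original graph lies inside at least one bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # 3. For each vertex, the bags containing it form a connected subtree.
    adj = {name: set() for name in bags}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    for x in vertices:
        holders = {name for name, bag in bags.items() if x in bag}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            cur = stack.pop()
            for nxt in (adj[cur] & holders) - seen:
                seen.add(nxt)
                stack.append(nxt)
        if seen != holders:
            return False
    return True

def width(bags):
    """Width of a decomposition: size of the largest bag minus one."""
    return max(len(bag) for bag in bags.values()) - 1

path_edges = [("a", "b"), ("b", "c")]
good = {0: {"a", "b"}, 1: {"b", "c"}}
print(is_tree_decomposition("abc", path_edges, good, [(0, 1)]), width(good))
```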
76 Standard Applications Stereo Vision [Sontag 10] Signal Processing Coding [Söding 05] [McEliece et al. 98]
77 General Graphs (Figure: a loopy graph and the junction tree obtained from it by tree decomposition, which is NP-hard to do optimally.) To deal with graphs with cycles, we group variables such that the original likelihood function is unchanged but we obtain a tree-structured model. If this junction tree has treewidth $t$ and each variable has $d$ states, sum-product requires time exponential in $t$, namely $O(n\, d^{\,t+1})$.
78 Sum-Product is Fragile (Figure: a factor graph before and after an update, and a junction tree before and after adding the edge $(u, v)$.) Updating a tree-structured factor graph can change messages in an execution of the sum-product algorithm. Adding a cycle to the input graph can change nodes in the junction tree (and associated factor graph).
79 Clustering in Factor Graphs (Figure: a factor graph before and after one round of rake and compress, with the resulting cluster functions.) In each round of clustering, we rake all leaves and compress a maximal independent set of degree-two nodes [Miller/Reif 84], while computing cluster functions.
80 Tree Contraction (Figure: successive rounds of rake and compress, followed by a finalize step, on the example factor graph, with the cluster functions computed in each round.) How long do intermediate cluster function computations take? How many rounds until everything is eliminated?
81 Cluster Tree (Figure: the example factor graph and its cluster tree, built in $O(n d^3)$ time.) We also keep track of the boundaries, defined as the set of edges leaving a cluster at the time of its creation during contraction.
82 Computing Marginals (Figure: messages passed along the cluster tree of the example factor graph.) Any marginal can be computed in $O(d^2 \log n)$ time.
83 Dealing with Cycles [Pearl 88, Yedidia et al ] (Figure: factor graph with a cycle among $u$, $v$, $x$.) Computing marginals or MAP configurations exactly in a model with cycles is NP-hard. However, we can still use the sum-product algorithm in two ways:
- Collapse multiple variables into a single variable to eliminate cycles.
- Run max-product as before, but until convergence.
The focus of research in approximate methods is on improving convergence times.
84 Message Passing and Free Energy We have been trying to minimize the potential energy of a protein conformation. But given that proteins exist in an ensemble of conformations, what do we minimize? The free energy of a protein is defined as:
$G = H - TS$
where $H$ is the enthalpy of the system, and $S$ is the entropy of the system. How does this relate to graphical models? We can define:
$G = \sum_c p(c) E(c) + T \sum_c p(c) \ln p(c) = -\ln Z$ (taking $T = 1$ and $p(c) = e^{-E(c)}/Z$)
Here, $Z$ is the normalizing constant, or partition function.
85 Approximate Inference Can we simplify the model in order to make it tractable? How do we do this? What can we say about the associated global likelihood? We'd like to relate our approximation $b(c)$ with the underlying global distribution $p(c)$. The Kullback-Leibler distance between $p(c)$ and $b(c)$ is defined as:
$D(b; p) = \sum_c b(c) \ln \frac{b(c)}{p(c)}$
Using that $p(c) = e^{-E(c)}/Z$ we get that:
$D(b; p) = \sum_c b(c) \ln b(c) + \sum_c b(c) E(c) + \ln Z$
which is minimized when $b = p$, and we get (taking $T = 1$):
$G = \sum_c p(c) E(c) + T \sum_c p(c) \ln p(c) = -\ln Z$
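These identities are easy to check numerically on a toy state space. The sketch below assumes $T = 1$, a handful of made-up energies, and hypothetical helper names; it verifies that the free energy of the Boltzmann distribution equals $-\ln Z$ and that the KL distance of any other distribution $b$ equals its variational free energy plus $\ln Z$.

```python
import math

def boltzmann(energies):
    """Return (p, Z) with p(c) = exp(-E(c)) / Z, assuming T = 1."""
    weights = [math.exp(-e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def free_energy(b, energies):
    """Average energy minus entropy: sum b*E + sum b*ln(b), with T = 1."""
    average_energy = sum(bi * e for bi, e in zip(b, energies))
    negative_entropy = sum(bi * math.log(bi) for bi in b if bi > 0)
    return average_energy + negative_entropy

def kl(b, p):
    """Kullback-Leibler distance D(b; p) = sum b * ln(b / p)."""
    return sum(bi * math.log(bi / pi) for bi, pi in zip(b, p) if bi > 0)

energies = [0.0, 1.0, 2.5]           # illustrative energies of three conformations
p, z = boltzmann(energies)
b = [1 / 3, 1 / 3, 1 / 3]            # a deliberately poor approximation
print(free_energy(p, energies), -math.log(z))
print(kl(b, p), free_energy(b, energies) + math.log(z))
```

Since $D(b; p) \ge 0$, the free energy of any $b$ is an upper bound on $-\ln Z$, which is the handle that variational methods use.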
86 Variational Inference Now, the fit of our estimated $b(c)$ can be measured using:
$D(b; p) = \sum_c b(c) \ln b(c) + \sum_c b(c) E(c) + \ln Z$
The variational approach to message-passing seeks to perform inference efficiently, while using bounds on $\ln Z$ to obtain a goodness of fit. Can you think of a lower bound for $\ln Z$? An upper bound? A key area of research is to develop bounds useful for performing inference. How does all this relate back to protein structure?
87 [Lange Lab, TU-München] How does the potential-energy based view of protein design differ from the free-energy based view?
88 Free Energy Is it easy to compute the free energy of a given protein sequence (with fixed backbone)? Can we minimize the free energy for a particular choice of sequence for protein design? How can we use graphical models? Are there other (more efficient/accurate) approaches?
Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space
Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space Pablo Gainza CPS 296: Topics in Computational Structural Biology Department of Computer
More informationSide-chain positioning with integer and linear programming
Side-chain positioning with integer and linear programming Matt Labrum 1 Introduction One of the components of homology modeling and protein design is side-chain positioning (SCP) In [1], Kingsford, et
More informationBIOINFORMATICS. Solving and analyzing side-chain positioning problems using linear and integer programming
BIOINFORMATICS Vol. 00 no. 0 2004, pages 1 11 doi:10.1093/bioinformatics/bti144 Solving and analyzing side-chain positioning problems using linear and integer programming Carleton L. Kingsford, Bernard
More informationLecture 18 Generalized Belief Propagation and Free Energy Approximations
Lecture 18, Generalized Belief Propagation and Free Energy Approximations 1 Lecture 18 Generalized Belief Propagation and Free Energy Approximations In this lecture we talked about graphical models and
More informationCourse Notes: Topics in Computational. Structural Biology.
Course Notes: Topics in Computational Structural Biology. Bruce R. Donald June, 2010 Copyright c 2012 Contents 11 Computational Protein Design 1 11.1 Introduction.........................................
More informationWhat is Protein Design?
Protein Design What is Protein Design? Given a fixed backbone, find the optimal sequence. Given a fixed backbone and native sequence, redesign a subset of positions (e.g. in the active site). What does
More informationProbabilistic Graphical Models
Probabilistic Graphical Models David Sontag New York University Lecture 6, March 7, 2013 David Sontag (NYU) Graphical Models Lecture 6, March 7, 2013 1 / 25 Today s lecture 1 Dual decomposition 2 MAP inference
More informationMolecular Modeling Lecture 11 side chain modeling rotamers rotamer explorer buried cavities.
Molecular Modeling 218 Lecture 11 side chain modeling rotamers rotamer explorer buried cavities. Sidechain Rotamers Discrete approximation of the continuous space of backbone angles. Sidechain conformations
More informationProtein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche
Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its
More informationComputational Protein Design
11 Computational Protein Design This chapter introduces the automated protein design and experimental validation of a novel designed sequence, as described in Dahiyat and Mayo [1]. 11.1 Introduction Given
More informationTexas A&M University
Texas A&M University Electrical & Computer Engineering Department Graphical Modeling Course Project Author: Mostafa Karimi UIN: 225000309 Prof Krishna Narayanan May 10 1 Introduction Proteins are made
More informationFast and Accurate Algorithms for Protein Side-Chain Packing
Fast and Accurate Algorithms for Protein Side-Chain Packing Jinbo Xu Bonnie Berger Abstract This paper studies the protein side-chain packing problem using the tree-decomposition of a protein structure.
More informationAbstract. Introduction
In silico protein design: the implementation of Dead-End Elimination algorithm CS 273 Spring 2005: Project Report Tyrone Anderson 2, Yu Bai1 3, and Caroline E. Moore-Kochlacs 2 1 Biophysics program, 2
More informationCMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction
CMPS 6630: Introduction to Computational Biology and Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the
More informationCMPS 3110: Bioinformatics. Tertiary Structure Prediction
CMPS 3110: Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the laws of physics! Conformation space is finite
More informationHOMOLOGY MODELING. The sequence alignment and template structure are then used to produce a structural model of the target.
HOMOLOGY MODELING Homology modeling, also known as comparative modeling of protein refers to constructing an atomic-resolution model of the "target" protein from its amino acid sequence and an experimental
More informationFast and Accurate Algorithms for Protein Side-Chain Packing
Fast and Accurate Algorithms for Protein Side-Chain Packing JINBO XU Toyota Technological Institute at Chicago and Massachusetts Institute of Technology AND BONNIE BERGER Massachusetts Institute of Technology
More informationSupporting Online Material for
www.sciencemag.org/cgi/content/full/309/5742/1868/dc1 Supporting Online Material for Toward High-Resolution de Novo Structure Prediction for Small Proteins Philip Bradley, Kira M. S. Misura, David Baker*
More informationTravelling Salesman Problem
Travelling Salesman Problem Fabio Furini November 10th, 2014 Travelling Salesman Problem 1 Outline 1 Traveling Salesman Problem Separation Travelling Salesman Problem 2 (Asymmetric) Traveling Salesman
More informationInteger Linear Programs