FlexSADRA: Flexible Structural Alignment using a Dimensionality Reduction Approach


Shirley Hui and Forbes J. Burkowski
University of Waterloo, 200 University Avenue W., Waterloo, Canada

ABSTRACT

A frequently studied problem in Structural Biology is determining the degree of similarity between two protein structures. The most common solution is to perform a three dimensional structural alignment of the two structures. Rigid structural alignment algorithms have been developed to accomplish this, but they treat the protein molecules as immutable structures. Since protein structures can bend and flex, algorithms dealing with rigid structures do not yield accurate results. To improve similarity studies, flexible structural alignment algorithms have been developed. The challenge for these algorithms is that protein structures are represented using thousands of atomic coordinate variables, so accounting for flexibility requires a large number of degrees of freedom and imposes a great computational burden. Past research in dimensionality reduction has shown that a linear technique called Principal Component Analysis (PCA) is well suited to reducing high dimensional data. This paper introduces a new flexible structural alignment algorithm called FlexSADRA, which uses PCA to perform flexible structural alignments. Test results show that FlexSADRA determines better alignments than rigid structural alignment algorithms. Unlike existing rigid and flexible algorithms, FlexSADRA addresses the problem in a significantly lower dimensional problem space and assesses not only the structural fit but also the structural feasibility of the final alignment.

Keywords: flexible protein structural alignment, dimensionality reduction, principal component analysis

1. INTRODUCTION

Proteins are molecules made up of a string of amino acids folded into three dimensional structures that range from simple to complex. The structure of a protein enables it to fulfill its biological function. This is evident in the case of the calcium binding molecules Calmodulin and Calbindin. These molecules are dumbbell-shaped, with the end lobes attached by a flexible linker region [1]. Although the amino acid sequence similarity between these proteins is quite low, they are similar in structure. It is the shape and flexibility of these molecules that allow them to wrap around calcium ions to fulfill their biological functions. Determining the structural similarity between two proteins has therefore become an important research area in Structural Biology.

In this paper, if two proteins have amino acid sequences that are very different, we treat the two structures as completely different proteins and do not attempt to measure their similarity. The more interesting situation is characterized by two amino acid sequences that share some sequence similarity, perhaps due to evolutionary descent from a common ancestor.

A protein structure may be represented by a set of atomic coordinates describing the protein conformation in three dimensional space. The coordinates are the x, y, z positions of each atom in the protein, concatenated into a long n dimensional vector called a protein conformation vector. We call the set of all protein conformation vectors representing a sampling of the protein's flexibility motion a protein conformational vector set. If there are m vectors in the set, the flexibility motion may be represented by an n × m matrix.

A rigid structural alignment may be determined between two proteins by aligning their conformation vector representations. This is usually accomplished by finding a rotation and translation of one structure onto the other that minimizes the distance between the two.
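As an illustration of the rigid alignment just described, the sketch below superposes one set of matched atoms onto another with the Kabsch procedure (the method of reference 11) and reports the resulting root mean squared deviation. It is a minimal sketch, not the authors' implementation: the function name `kabsch_superpose`, the use of NumPy, and the assumption that both structures are supplied as k × 3 arrays of already matched atoms are all choices made here for illustration.

```python
import numpy as np


def kabsch_superpose(P, Q):
    """Rigidly superpose P onto Q (both k x 3 arrays of matched atoms).

    Returns the transformed copy of P and the RMSD after superposition.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Translation: move both centroids to the origin.
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    P0 = P - p_mean
    Q0 = Q - q_mean

    # Rotation: SVD of the 3 x 3 covariance matrix between the two sets.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    sign = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T

    # Apply the rotation, then move onto the centroid of Q.
    P_fit = P0 @ R.T + q_mean
    rmsd = np.sqrt(np.mean(np.sum((P_fit - Q) ** 2, axis=1)))
    return P_fit, rmsd
```

A conformation vector of length n = 3k can be passed in after `x.reshape(-1, 3)`, and the superposed result can be flattened back with `P_fit.reshape(-1)`.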
The root mean squared deviation (RMSD) is used to calculate the distance between the two aligned structures and so to measure their degree of similarity. A small RMSD value means that the two proteins are structurally very similar. Rigid alignment algorithms assume that proteins are static structures. It is often the case that a better structural alignment may be achieved if protein flexibility is considered, for example in situations where one protein is hinge bent with respect to the other.

Existing flexible structural alignment algorithms model protein flexibility using the atomic coordinates of one of the protein structures and treat the other protein as a rigid structure. Finding an alignment that minimizes the RMSD involves searching all possible conformations in the protein conformational vector set, that is, searching over all the different positions for the protein's atomic coordinates. Even small proteins may consist of several thousand atoms, so the computational burden required to model protein flexibility in structural alignment algorithms quickly becomes enormous.

This paper introduces a novel algorithm called FlexSADRA (Flexible Structural Alignment using a Dimensionality Reduction Approach). FlexSADRA uses a dimensionality reduction approach to perform flexible structural alignments using significantly fewer degrees of freedom than existing alignment algorithms. The algorithm assesses not only the structural fit but also the structural feasibility of the final alignment.

Further author information: send correspondence to Shirley Hui, shirleyhui@alumni.uwaterloo.ca

2. BACKGROUND

High dimensional data is often represented using many more degrees of freedom than are actually necessary. The goal of dimensionality reduction techniques is to find a mapping for the data from its high dimensional space to a lower dimensional space with minimal information loss. A popular linear dimensionality reduction technique is Principal Component Analysis (PCA) [2, 3]. This technique is commonly used since it is fast and straightforward to implement. It has been applied to many different areas of research, including face recognition systems and protein flexibility modeling [1, 4]. Dimensionality reduction using PCA is achieved by transforming the original variables describing the data to a set of new variables called principal components.
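To make the idea of principal components as new variables concrete, the sketch below computes how much of the variance in a conformational vector set is carried by the leading components and picks the smallest number of components that retains a requested fraction. This is an illustrative sketch only: the function name `components_needed`, the 0.85 default, and the choice to centre the data on the mean conformation before the decomposition are assumptions made here, not details taken from the paper.

```python
import numpy as np


def components_needed(X, retain=0.85):
    """Smallest d whose leading principal components retain `retain`
    of the total variance.

    X is an n x m array whose columns are conformation vectors.
    Returns (d, cumulative_fraction_of_variance).
    """
    X = np.asarray(X, dtype=float)

    # Assumption: centre each coordinate on its mean over the conformations.
    Xc = X - X.mean(axis=1, keepdims=True)

    # Singular values come back in descending order; their squares are
    # proportional to the variance explained by each component.
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    cum = np.cumsum(var) / var.sum()

    # First index at which the cumulative fraction reaches the target.
    d = int(np.searchsorted(cum, retain)) + 1
    d = min(d, len(s))  # guard against rounding when retain is close to 1
    return d, cum
```

For example, `d, cum = components_needed(X, retain=0.85)` gives the number of components to keep, and `cum[d - 1]` is the fraction of variance those components actually retain.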
The first principal component accounts for the most variation in the data, while successive components account for lesser amounts. Dimensionality reduction is achieved by using the d components that explain the largest amount of variance in the original data.

In algorithms for face recognition applications, there is typically a training set of faces that is used to find a set of principal components that describe a lower dimensional feature space called a face space [1, 5]. Each principal component represents a characteristic feature from the faces in the training set. Therefore, a face can be composed of a combination of the principal components in the right proportions. If a new face enters the system, it is projected onto the face space to determine a lower dimensional representation of the new face using the calculated principal components. The new face can then be determined to be similar to an existing face or not a face at all.

Principal Component Analysis has also been applied to model protein flexibility motion [4, 6]. The principal components describe the protein's flexibility and are combined in different proportions to reconstruct the original conformations to varying degrees of accuracy. As a result, only the first few principal components that retain a certain amount of the original flexibility are used to model the flexibility. The result is an approximate description of the protein flexibility motion using only a few degrees of freedom.

Our dimensionality reduction approach to the flexible structural similarity problem is similar to the approach used in face recognition systems and protein flexibility modeling. Instead of faces, we use the protein conformations representing the protein's flexibility to determine a lower dimensional space. This space describes the original protein's flexibility using only a few principal components rather than thousands of atomic coordinates. A rigid protein structure is projected onto this lower dimensional space. The projection can be mapped back to high dimensional space to determine a novel flexed protein structure that is as close to the rigid protein structure as possible.

3. DATA

The protein structures used in the FlexSADRA algorithm experiments are obtained from the Protein Data Bank (PDB) [7]. Only the backbone atomic coordinates are used, with no side chains. Protein conformational data can be generated through experimental methods such as X-Ray Crystallography or NMR Spectroscopy, but these methods are time-consuming and expensive and do not generate many structures. As a result, a molecular dynamics (MD) simulation software application called NAMD (Not Another Molecular Dynamics) [8, 9] is used in FlexSADRA to generate the conformational data sets for the flexible protein. MD simulations generate protein conformations based on first principles calculations and, although they are less accurate, they are faster and more affordable than experimental methods.
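The backbone-only representation described above can be sketched as a small reader for PDB-format text that keeps backbone atoms from ATOM records and concatenates their coordinates into a conformation vector. This is a sketch under stated assumptions: the paper does not say which atoms it counts as backbone, so the default set N, CA, C is an assumption, and the names `backbone_vector` and `format_atom_line` are hypothetical helpers introduced here. The column positions follow the fixed-width PDB ATOM record layout (atom name in columns 13 to 16, x, y, z in columns 31 to 54).

```python
def format_atom_line(serial, name, res, chain, resseq, x, y, z):
    """Build a fixed-width PDB ATOM record (demo helper for this sketch)."""
    return (f"ATOM  {serial:5d}  {name:<3s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def backbone_vector(pdb_text, atom_names=("N", "CA", "C")):
    """Concatenate x, y, z of backbone atoms into one conformation vector.

    Only ATOM records are read, and only the first model of a
    multi-model file is used.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if line[:6] != "ATOM  ":
            continue  # skip HETATM, REMARK and other records
        if line[12:16].strip() not in atom_names:
            continue  # skip side-chain atoms
        # x, y and z occupy three consecutive 8-character fields.
        coords.extend(float(line[i:i + 8]) for i in (30, 38, 46))
    return coords


sample = "\n".join([
    format_atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    format_atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    format_atom_line(3, "C", "MET", "A", 1, 26.913, 26.639, 3.531),
    format_atom_line(4, "CB", "MET", "A", 1, 25.112, 24.880, 3.649),
])
vector = backbone_vector(sample)  # 9 numbers: three backbone atoms
```

Applying the same function to each snapshot of a simulation and stacking the resulting vectors as columns gives a matrix of the kind used as the protein conformational vector set.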

4. METHOD

Principal components can be computed in various ways, but the decomposition method known as Singular Value Decomposition (SVD) is commonly used. If the protein conformational vector set $X$ is an $n \times m$ matrix, then $X$ has a singular value decomposition as follows:

$$X = U E V^T \tag{1}$$

The matrix $U$ is an $n \times n$ column-orthogonal matrix, $V$ is an $m \times m$ column-orthogonal matrix, and $E$ is an $n \times m$ matrix whose off-diagonal entries are all 0. The diagonal entries of $E$ are the singular values $\sigma_1, \sigma_2, \ldots, \sigma_n$ of $X$, where $\sigma_1 > \sigma_2 > \ldots > \sigma_n$. Since there will be many small singular values, the original matrix may be approximated with good accuracy using only $d$ columns of $U$ and $V$ and only $d$ singular values as follows:

$$X_d = U_d E_d V_d^T \tag{2}$$

A lower dimensional representation $y$ of a high dimensional protein structure $x$ is determined by the following projection operation:

$$y = U_d^T x \tag{3}$$

The original data may be approximately reconstructed using the following operation:

$$\hat{x} = (U_d^T)^+ y \tag{4}$$

where $(U_d^T)^+$ is known as the pseudoinverse of $U_d^T$ [3]. (The pseudoinverse of a matrix $U$ is calculated by $U^+ = (U^T U)^{-1} U^T$.)

The matrix $U$ is a transformation matrix that maps points that are close to each other in high dimensions to points that are close to each other in lower dimensions. If $U$ is determined according to the flexible protein structure data, it defines a protein's flexibility in a lower dimensional space.

For the flexible protein structure similarity problem we wish to determine the degree to which two protein structures are similar. Instead of working with high dimensional atomic coordinates, a flexed protein structure that is very close to the rigid protein can be determined more simply by projecting the rigid protein into the lower dimensional space of the flexible protein. When this lower dimensional structure is transformed back to high dimensions, it yields a flexed protein conformation that is as close to the rigid protein as possible.

5. FLEXSADRA ALGORITHM

The inputs to the algorithm are the rigid and flexible protein structures. The flexible protein is represented by a protein conformational vector set, while the rigid structure is represented by a protein conformation vector of atomic coordinates. Assessment of the flexible alignments is based on two criteria: how closely the two structures are aligned and how feasible the flexed structure is. The closeness of fit is determined by calculating the RMSD value. Since the alignment relies on the flexible protein structure flexing in a specific way, the feasibility of fit is determined by calculating the conformational energy of the flexed structure.

Algorithm

Given:
$x_r$: vector of $n$ atomic coordinates representing the rigid protein structure
$X$: $n \times m$ protein conformational vector set consisting of the atomic coordinates describing the flexible protein structure
$d$: target dimensionality

Step One: Reduce the dimensionality of $X$ to determine a lower dimensional space represented by $U_d$. Determine $U_d$ by applying PCA to $X$ using SVD to obtain $d$ principal components, which are the columns of the matrix $U_d$.
Step Two: Find $y$, the representation of $x_r$ in the lower dimensional space, according to equation 3.
Step Three: Transform $y$ back to high dimensional space according to equation 4 to obtain a novel flexed conformation $x_f$ that is as close as possible to $x_r$.
Step Four: Assess the quality of the alignment achieved by calculating the RMSD between $x_f$ and $x_r$ and calculating the conformational energy score for $x_f$.

6. CONSIDERATIONS

Since the focus of this algorithm is protein flexibility, any overall structural rotational or translational movements are removed from the data set following the procedure used in the past [4, 10]. The result is that the data depict only the flexing degrees of freedom of the protein. This is accomplished by finding the mean structure in the data set and then rigidly aligning each structure to the mean structure [11].

An initial structural alignment must also be performed in order to determine equivalence between residues in the two structures to be aligned. Numerous algorithms exist to align protein sequences, and obtaining an initial alignment is a common starting point in many alignment algorithms [12]. The FlexSADRA algorithm only flexes the sections of the protein molecule corresponding to aligned residues between the flexible and rigid proteins.

In some cases, the flexed structure is a projection that is far away from the other points in the lower dimensional flexibility space. These conformations usually have high energy values corresponding to very unlikely conformations. In situations like these, a new flexed conformation must be found with a better energy value.
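Steps One to Four of the algorithm in Section 5 can be sketched in a few lines of NumPy. This is a hedged sketch rather than the authors' code: the function name `flexsadra_sketch` is hypothetical, the data are assumed to be already superposed and of equal length, centring on the mean conformation is an assumption that the paper does not state, and the conformational energy part of Step Four is omitted because it needs an external force field.

```python
import numpy as np


def flexsadra_sketch(x_r, X, d):
    """Project a rigid structure into a flexibility space and map it back.

    x_r: length-n rigid conformation vector (n = 3 * number of atoms).
    X:   n x m array whose columns are conformations of the flexible protein.
    d:   target dimensionality.
    Returns (x_f, y, rmsd).
    """
    x_r = np.asarray(x_r, dtype=float)
    X = np.asarray(X, dtype=float)

    # Assumption: work relative to the mean conformation of the flexible set.
    mean = X.mean(axis=1)

    # Step One: PCA via SVD; the first d columns of U span the reduced space.
    U, _, _ = np.linalg.svd(X - mean[:, None], full_matrices=False)
    U_d = U[:, :d]

    # Step Two: equation (3), the low dimensional representation of x_r.
    y = U_d.T @ (x_r - mean)

    # Step Three: equation (4). Because U_d has orthonormal columns,
    # the pseudoinverse of U_d.T is U_d itself.
    x_f = np.linalg.pinv(U_d.T) @ y + mean

    # Step Four (RMSD part only): deviation per atom between x_f and x_r.
    diff = (x_f - x_r).reshape(-1, 3)
    rmsd = np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))
    return x_f, y, rmsd
```

Increasing `d` can only lower or preserve the RMSD, since each larger space contains the smaller one; the trade-off the paper describes is between that fit and the number of degrees of freedom retained.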
An energy minimization performed with a molecular dynamics simulation may be used to find a more energetically favourable conformation that is as close to the original structure as possible.

Finally, a requirement of our algorithm is that the structures being flexed must be of the same length. This is because the projection step of the algorithm can only be performed on objects with the same number of degrees of freedom. Therefore, only the aligned parts of the structures, as determined by the initial alignment, can be flexed. If the structures have different lengths, the final flexed structure may have gap sections where the atomic coordinates are missing. This is not desirable, since the location of atoms in the gap areas must be known in order for the potential energy to be calculated. In addition, a complete flexed structure is more practical for further analysis than fragments of aligned sections. This problem does not limit the approach to structures of the same length, however, since the missing coordinates may be estimated using a variety of methods [13, 14].

7. RESULTS

The FlexSADRA algorithm was applied to five different pairs of proteins, and the results are fully documented in previous work [14]. Due to page restrictions, only the results for two pairs of molecules are discussed in this paper. The FlexSADRA alignments were compared to alignments obtained by a rigid structural alignment [11] and to the results from another flexible structural alignment algorithm called FlexProt [15].

Homeodomain Protein / Paired Domain Protein (6PAX-1PDN)

The 6PAX protein is a member of the homeodomain family of proteins. These proteins are transcription regulators that bind to specific DNA sequences of other genes to regulate their expression and to induce cell development and differentiation. The paired box 1PDN molecule is homologous to the 6PAX homeodomain proteins. Both proteins consist of two ends made up of alpha helices connected by a flexible linker region, allowing them to exhibit a hinge-like motion. A flexible 6PAX molecule was aligned with a rigid 1PDN molecule.

Apo Calbindin / EF Hand Domain Protein (1CDN-1H8B)

Calbindin is a calcium binding protein that is involved in the uptake, transportation and absorption of calcium in the body. This protein belongs to a superfamily known as the EF hand superfamily. Calbindin has two EF hand domains and a short linker region. Small shear movements in the helices and loops exhibit the flexibility of this molecule. A flexible Calbindin molecule was structurally aligned with a rigid molecule from the EF hand superfamily called alpha-actinin.

The results of comparing FlexSADRA to a rigid alignment algorithm and the FlexProt algorithm are summarized in the table below:

Test         | DOF (% Retained)     | RMSD (Å)                     | Energy
(Flex-Rigid) | FlexProt | FlexSADRA | Rigid | FlexProt | FlexSADRA | Rigid | FlexSADRA (Min)
6PAX-1PDN    |          | (85)      |       |          |           |       |
1CDN-1H8B    |          | (65)      |       |          |           |       |

Table 1. Summary of results for FlexSADRA vs Rigid and FlexProt protein structural alignments. DOF (% Retained): number of degrees of freedom and the amount of original data retained by using the degrees of freedom. RMSD: root mean squared deviation in Angstroms. Energy: the conformational energy for the rigid structure and the minimized FlexSADRA flexed structure.

Figure 1. FlexSADRA flexible structural alignment of 1PDN and 6PAX. Left - Rigid Alignment. Middle - FlexSADRA Alignment. Right - Minimized Alignment. 1PDN (red), 6PAX (blue).

Figure 2. FlexSADRA flexible structural alignment of 1CDN and 1H8B. Left - Rigid Alignment. Right - FlexSADRA Alignment. 1H8B (red), 1CDN (blue).

8. DISCUSSION

The results of the tests indicate that the FlexSADRA algorithm is able to perform flexible structural alignment in a reduced dimensionality problem space. In general, the reduction in the dimensionality of the data was 50 to 70%. In many cases, the number of dimensions decreased from a few thousand to only a few hundred.

The flexible structural alignments result in a significant improvement in RMSD values. The improvement is more than 7 Å for 6PAX and 1PDN and more than 2 Å for 1CDN and 1H8B. Without the added flexibility, the structures would seem to be very dissimilar. After allowing flexibility, however, the RMSD decreases to less than 2 Å in both cases, indicating that the molecules are indeed structurally very similar.

The energy value for the flexed 1PDN protein prior to energy minimization was high with respect to its original energy value. This would indicate that the flexed 1PDN structure does not represent a likely flexed conformation. To determine whether this was the case, a short energy minimization simulation was run on the flexed structure. After minimization, the structure was compared with the pre-minimized flexed structure. The RMSD between them is about 0.7 Å, which indicates that the pre-minimized and minimized structures are almost identical. The offending areas may be a result of the fact that a certain amount of information is lost when using dimensionality reduction techniques. Another factor could be that missing coordinate values in the alignment must be estimated, and poorly estimated coordinate values will increase the energy value. It can also be the case that the flexed structure has a low energy value, as for 1CDN and 1H8B. Since that structure already had an acceptable energy value, it was not minimized further.

Overall, the RMSD values obtained using FlexSADRA and FlexProt are similar; however, the algorithms are very different. The biggest difference is that FlexProt performs the structural alignment using the original high dimensional atomic coordinates. The algorithm is not trivial, and the alignment consists of a series of steps each involving a large number of computations. In addition, the output of FlexProt is a set of disconnected rigid fragments of the flexed protein. It is not clear if this structure corresponds to a structurally feasible conformation. An energy value analysis should be performed on the flexed structure, but it is not possible to do so with most energy calculators since they require connected structures as input. Also, the atomic coordinates for the hinge areas are not provided. In many cases, hinge-like or connecting areas are flexible loops and are still important parts of the molecule. Another problem is that molecules do not move as disconnected rigid fragments. As a result, FlexProt only provides a disjoint partial view of the flexed protein.

The FlexSADRA algorithm provides a complete picture of the flexed protein. Flexed areas are not modeled as isolated regions joining rigid fragments, but apply to the molecule as a whole. Moreover, FlexSADRA is a data driven algorithm and flexes proteins according to the given data. This is in contrast with the FlexProt algorithm, which does not use any experimental or simulated flexible data to determine its flexible alignments.

9. CONCLUSION

While a few researchers have developed algorithms to address the flexible structural alignment problem, none have tried to address it using a dimensionality reduction approach. In this paper, an algorithm called FlexSADRA has been introduced for the flexible structural alignment of three dimensional protein structures. This algorithm uses a dimensionality reduction approach to model protein flexibility motion and to perform the structural alignment. It has been tested on protein molecules that have previously been structurally aligned or studied in the literature. The results of the tests show that flexible structural alignment can be performed in a reduced dimensionality problem space. The results are better than those obtained from rigid structural alignments and comparable to the results from another flexible structural alignment algorithm called FlexProt. In contrast to these algorithms, however, the FlexSADRA algorithm is simpler, involves significantly fewer degrees of freedom, and only deals with feasible structures.

REFERENCES

1. M. Turk and A. Pentland, Eigenfaces for recognition, J. Cog. Neu., vol. 3.
2. H. Hotelling, Analysis of a complex of statistical variables into principal components, J. Educ. Psych., vol. 24.
3. I. Jolliffe, Principal Component Analysis. New York: Springer, 2nd ed.
4. M. Teodoro, G. P. Jr., and L. Kavraki, A dimensionality reduction approach to modeling protein flexibility, In Proc. ACM Int. Conf. on Computational Biology (RECOMB).
5. A. Goldstein, L. Harmon, and A. Lesk, Identification of human faces, Proc. IEEE, vol. 9.
6. S. Hui and M. Shakeel, An investigative approach into dimensionality reduction techniques for protein flexibility modeling. ISMB 2004 Poster Presentation.
7. H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shindyalov, and P. Bourne, The protein data bank, Nucleic Acids Research, vol. 28.
8. L. Kalé, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, J. P. N. Krawetz, A. Shinozaki, K. Varadarajan, and K. Schulten, Namd - not another molecular dynamics.
9. J. M. Haile, Molecular Dynamics Simulation: Elementary Methods. John Wiley & Sons, Inc.
10. A. Amadei, A. B. M. Linssen, and H. J. C. Berendsen, Essential dynamics of proteins, Proteins: Structure, Function, and Genetics, vol. 17.
11. W. Kabsch, A solution for the best rotation to relate two sets of vectors, Acta Cryst., vol. A32, p. 922.
12. W. Wriggers and K. Schulten, Protein domain movements: Detection of rigid domains and visualization of effective rotations in comparisons of atomic coordinates, Proteins: Structure, Function, and Genetics, vol. 29, pp. 1-14.
13. R. Little and D. Rubin, Statistical Analysis with Missing Data. John Wiley & Sons, Inc.
14. S. Hui, Flexsadra: Flexible structural alignment using a dimensionality reduction approach, Master's thesis, University of Waterloo, School of Computer Science, Faculty of Mathematics, Sept.
15. M. Shatsky, H. Wolfson, and R. Nussinov, Flexible protein alignment and hinge detection, Proteins: Structure, Function, and Genetics, vol. 48, 2002.


More information

file:///biology Exploring Life/BiologyExploringLife04/

file:///biology Exploring Life/BiologyExploringLife04/ Objectives Identify carbon skeletons and functional groups in organic molecules. Relate monomers and polymers. Describe the processes of building and breaking polymers. Key Terms organic molecule inorganic

More information

CS273: Algorithms for Structure Handout # 2 and Motion in Biology Stanford University Thursday, 1 April 2004

CS273: Algorithms for Structure Handout # 2 and Motion in Biology Stanford University Thursday, 1 April 2004 CS273: Algorithms for Structure Handout # 2 and Motion in Biology Stanford University Thursday, 1 April 2004 Lecture #2: 1 April 2004 Topics: Kinematics : Concepts and Results Kinematics of Ligands and

More information

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy Design of a Novel Globular Protein Fold with Atomic-Level Accuracy Brian Kuhlman, Gautam Dantas, Gregory C. Ireton, Gabriele Varani, Barry L. Stoddard, David Baker Presented by Kate Stafford 4 May 05 Protein

More information

Automated Assignment of Backbone NMR Data using Artificial Intelligence

Automated Assignment of Backbone NMR Data using Artificial Intelligence Automated Assignment of Backbone NMR Data using Artificial Intelligence John Emmons στ, Steven Johnson τ, Timothy Urness*, and Adina Kilpatrick* Department of Computer Science and Mathematics Department

More information

Lecture 24: Principal Component Analysis. Aykut Erdem May 2016 Hacettepe University

Lecture 24: Principal Component Analysis. Aykut Erdem May 2016 Hacettepe University Lecture 4: Principal Component Analysis Aykut Erdem May 016 Hacettepe University This week Motivation PCA algorithms Applications PCA shortcomings Autoencoders Kernel PCA PCA Applications Data Visualization

More information

Geometrical Concept-reduction in conformational space.and his Φ-ψ Map. G. N. Ramachandran

Geometrical Concept-reduction in conformational space.and his Φ-ψ Map. G. N. Ramachandran Geometrical Concept-reduction in conformational space.and his Φ-ψ Map G. N. Ramachandran Communication paths in trna-synthetase: Insights from protein structure networks and MD simulations Saraswathi Vishveshwara

More information

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror Protein structure prediction CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror 1 Outline Why predict protein structure? Can we use (pure) physics-based methods? Knowledge-based methods Two major

More information

Bioinformatics. Macromolecular structure

Bioinformatics. Macromolecular structure Bioinformatics Macromolecular structure Contents Determination of protein structure Structure databases Secondary structure elements (SSE) Tertiary structure Structure analysis Structure alignment Domain

More information

DNA Structure. Voet & Voet: Chapter 29 Pages Slide 1

DNA Structure. Voet & Voet: Chapter 29 Pages Slide 1 DNA Structure Voet & Voet: Chapter 29 Pages 1107-1122 Slide 1 Review The four DNA bases and their atom names The four common -D-ribose conformations All B-DNA ribose adopt the C2' endo conformation All

More information

Protein Structure Prediction and Display

Protein Structure Prediction and Display Protein Structure Prediction and Display Goal Take primary structure (sequence) and, using rules derived from known structures, predict the secondary structure that is most likely to be adopted by each
